Validate S3 signature Python - python-3.x

I want to validate the signature of an S3 (hosted at digitalocean) presigned URL via Python. As far as I know, the signature consists of the full URL, with the secret key.
I've already tried things like AWS S3 presigned urls with boto3 - Signature mismatch, but that results in a different signature.
I want to check the signature given in the URL (of an image for example) by recreating it with the hashing algorithm.
How would I go about doing this?

I had the same problem and was hoping the boto package would provide an easy way to do this, but unfortunately it doesn't.
I also tried to use boto to create the same signature based on the URL, but the problem is the timestamp (X-Amz-Date in the URL).
To get the exact same signature, the timestamp provided in the url needs to be used for generating.
I went down the rabbit hole trying to 'override' the datetime but it seems like it's impossible.
So what's left is generating the signature from scratch, like you said you tried. The code in the question you linked does work but it's not straightforward.
Inspired by that link and the boto3 source, this is what I've created and it seems to work:
from urllib.parse import urlparse, parse_qs, urlencode, quote
import hashlib
import hmac
from django.conf import settings
from datetime import datetime, timedelta
def validate_s3_url(url: str, method='GET'):
    """
    Check whether the signature in the given S3 URL is valid,
    considering the other parts of the URL.
    This requires that we have access to the secret access key
    that was used to sign the request (the access key ID is
    available in the URL).
    """
    parts = urlparse(url)
    querydict = parse_qs(parts.query)
    # get relevant query parameters
    url_signature = querydict['X-Amz-Signature'][0]
    credentials = querydict['X-Amz-Credential'][0]
    algorithm = querydict['X-Amz-Algorithm'][0]
    timestamp = querydict['X-Amz-Date'][0]
    signed_headers = querydict['X-Amz-SignedHeaders'][0]
    expires = querydict['X-Amz-Expires'][0]
    # reject URLs whose validity window (timestamp + expires) has passed
    timestamp_datetime = datetime.strptime(timestamp, "%Y%m%dT%H%M%SZ")
    if timestamp_datetime + timedelta(
            seconds=int(expires) if expires.isdigit() else 0) < datetime.utcnow():
        return False
    # if we have multiple access keys we could use access_key_id to get the right one
    access_key_id, credential_scope = credentials.split("/", maxsplit=1)
    host = parts.netloc
    # important: the keys are listed in sorted order, which is essential;
    # Python 3.7+ dicts preserve this insertion order
    canonical_querydict = {
        'X-Amz-Algorithm': [algorithm],
        'X-Amz-Credential': [credentials],
        'X-Amz-Date': [timestamp],
        'X-Amz-Expires': querydict['X-Amz-Expires'],
        'X-Amz-SignedHeaders': [signed_headers],
    }
    # this is optional (to force download with specific name)
    # if used, it's passed in as 'ResponseContentDisposition' Param when signing.
    if 'response-content-disposition' in querydict:
        canonical_querydict['response-content-disposition'] = querydict['response-content-disposition']
    canonical_querystring = urlencode(canonical_querydict, doseq=True, quote_via=quote)
    # build the request, hash it and build the string to sign
    canonical_request = f"{method}\n{parts.path}\n{canonical_querystring}\nhost:{host}\n\n{signed_headers}\nUNSIGNED-PAYLOAD"
    hashed_request = hashlib.sha256(canonical_request.encode('utf-8')).hexdigest()
    string_to_sign = f"{algorithm}\n{timestamp}\n{credential_scope}\n{hashed_request}"
    # generate signing key from credential scope.
    signing_key = f"AWS4{settings.AWS_SECRET_ACCESS_KEY}".encode('utf-8')
    for message in credential_scope.split("/"):
        signing_key = hmac.new(signing_key, message.encode('utf-8'), hashlib.sha256).digest()
    # sign the string with the key and check if it's the same as the one provided in the url
    signature = hmac.new(signing_key, string_to_sign.encode('utf-8'), hashlib.sha256).hexdigest()
    return url_signature == signature
This uses django settings to get the secret key but really it could come from anywhere.
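As a smaller, standalone illustration of the key-derivation step in the function above, here is a minimal sketch that uses only the standard library. The secret key, the credential scope and the helper names (derive_signing_key, signatures_match) are made up for the example. It also compares signatures with hmac.compare_digest, a constant-time comparison that the answer's code does not use; treat that as an optional hardening step, not as part of the original answer.

```python
import hashlib
import hmac


def derive_signing_key(secret_key: str, credential_scope: str) -> bytes:
    """Chain HMAC-SHA256 over each scope component, starting from 'AWS4' + secret."""
    key = f"AWS4{secret_key}".encode("utf-8")
    for part in credential_scope.split("/"):
        key = hmac.new(key, part.encode("utf-8"), hashlib.sha256).digest()
    return key


def signatures_match(expected_hex: str, provided_hex: str) -> bool:
    """Compare two hex signatures in constant time."""
    return hmac.compare_digest(expected_hex, provided_hex)


# Hypothetical values, for illustration only.
key = derive_signing_key("example-secret", "20240101/us-east-1/s3/aws4_request")
signature = hmac.new(key, b"example string to sign", hashlib.sha256).hexdigest()

print(signatures_match(signature, signature))  # True
print(signatures_match(signature, "0" * 64))   # False
```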

Related

aws_encryption_sdk does not return same string on decrypting in python?

I am looking to encrypt some secret text using aws_encryption_sdk in Python. However, I see some unwanted characters when decrypting. I have used the Java version of the SDK before and did not see this kind of issue. Below is my code.
import aws_encryption_sdk
from aws_encryption_sdk import CommitmentPolicy
import botocore.session
import pytest
import base64
def cycle_string(key_arn, source_plaintext, botocore_session=None):
    client = aws_encryption_sdk.EncryptionSDKClient(commitment_policy=CommitmentPolicy.REQUIRE_ENCRYPT_REQUIRE_DECRYPT)
    kms_kwargs = dict(key_ids=[key_arn])
    print(kms_kwargs)
    if botocore_session is not None:
        kms_kwargs["botocore_session"] = botocore_session
    master_key_provider = aws_encryption_sdk.StrictAwsKmsMasterKeyProvider(**kms_kwargs)
    # Encrypt the plaintext source data
    ciphertext, encryptor_header = client.encrypt(source=source_plaintext, key_provider=master_key_provider)
    # print(ciphertext, encryptor_header)
    # Decrypt the ciphertext
    encrrtext=base64.b64encode(ciphertext)
    encrciphertext=base64.b64decode(encrrtext)
    cycled_plaintext, decrypted_header = client.decrypt(source=encrciphertext, key_provider=master_key_provider)
    # print(cycled_plaintext, decrypted_header)
    print(encrrtext)
    print(cycled_plaintext)
    print(source_plaintext)
    # Verify that the "cycled" (encrypted, then decrypted) plaintext is identical to the source plaintext
    assert cycled_plaintext == source_plaintext
    # Verify that the encryption context used in the decrypt operation includes all key pairs from
    # the encrypt operation. (The SDK can add pairs, so don't require an exact match.)
    #
    # In production, always use a meaningful encryption context. In this sample, we omit the
    # encryption context (no key pairs).
    assert all(
        pair in decrypted_header.encryption_context.items() for pair in encryptor_header.encryption_context.items()
    )
plaintext = "hello there"
cmk_arn = "<arn>"
cycle_string(key_arn=cmk_arn, source_plaintext=plaintext, botocore_session=botocore.session.Session())
O/P:
b'hello there'
hello there
I was expecting it to return the same text as the source. Any help on this would be appreciated.
It seems the SDK returns a byte string. When printing, Python marks these with the b'' prefix. You can convert the byte string to a normal string by adding cycled_plaintext = cycled_plaintext.decode('UTF-8') before the assertion.
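To see the difference without calling KMS, here is a minimal sketch in which a bytes literal stands in for the value the decrypt call returns (an assumption for the example; no AWS call is made):

```python
# Stand-in for the value returned by client.decrypt (assumed to be bytes).
cycled_plaintext = b"hello there"
source_plaintext = "hello there"

# bytes and str never compare equal, even when the text is the same.
print(cycled_plaintext == source_plaintext)  # False

# Decoding turns the bytes into a str, after which the comparison holds.
decoded = cycled_plaintext.decode("utf-8")
print(decoded == source_plaintext)  # True
print(decoded)                      # hello there
```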

Acquire Keyvault Secret within a httptrigger and Use it to Acquire Info to be output by Function-Python

I have the following code which I use to acquire a secret, use secret to log into portal and download a csv table. This works ok outside a function.
import pandas as pd
from arcgis.gis import GIS
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
credential = DefaultAzureCredential()
secret_client = SecretClient(vault_url="https://xxxx-dev-vault.vault.azure.net/", credential=credential)
secret = secret_client.get_secret("Secret")
#Log into portal
gis = GIS("https://url.com", "Username", secret.value)
#Extracting Table
item=gis.content.get('content_id')
table=item.layers[0].query().sdf
I need to include this bit of code in my HTTP trigger so that the function logs into the portal and extracts the csv/table, and the table is either returned as the JSON body of the trigger response or stored in a blob. How can I achieve this?
I initially thought I could achieve this by integrating the vault in the HTTP trigger as described in this post. However, I ended up with the secret being returned instead, and I have been unable to use the secret within the function.
I don't mind even an example that logs into an email account or any other portal, provided the secret password is acquired within the function runtime. Ultimately, I am interested in understanding how to retrieve a secret and use it within a function to produce the function's output.
This is the code I tested on my side with a local CSV file. I'm not sure whether the line dict_reader = csv.DictReader(table) works on your side, so test it and modify the code yourself if it raises an error.
import logging
import azure.functions as func
from arcgis.gis import GIS
import csv
import json
def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    # do some configuration in application settings of your function app as previous post mentioned, and then we can get the secret of key vault.
    # secret = os.environ["testkeyvault"]
    # gis = GIS("https://url.com", "Username", secret)
    #Extracting Table
    # item=gis.content.get('content_id')
    # table=item.layers[0].query().sdf
    # The four lines below is what I test in my side, I use a csv file in local and convert it to json and use "return func.HttpResponse(json_from_csv)" at the end of the code. The function will response json.
    file = open("C:\\Users\\Administrator\\Desktop\\user.csv", "r")
    dict_reader = csv.DictReader(file)
    dict_from_csv = list(dict_reader)[0]
    json_from_csv = json.dumps(dict_from_csv)
    # Your code should be like the three lines below. But as I didn't test with the csv file from gis, so I'm not sure if "dict_reader = csv.DictReader(table)" can work.
    # dict_reader = csv.DictReader(table)
    # dict_from_csv = list(dict_reader)[0]
    # json_from_csv = json.dumps(dict_from_csv)
    return func.HttpResponse(json_from_csv)
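The CSV-to-JSON step in the function above can be tried on its own. The following is a minimal sketch with made-up in-memory CSV text in place of the local file. Note that csv.DictReader expects an iterable of text lines (such as an open file), so whether it accepts `table` depends on what kind of object `table` is.

```python
import csv
import io
import json

# Made-up CSV content standing in for the local file used in the answer.
csv_text = "name,age\nAda,36\nGrace,45\n"

# DictReader uses the first row as keys and yields one dict per data row.
rows = list(csv.DictReader(io.StringIO(csv_text)))

first_row_json = json.dumps(rows[0])  # only the first row, as in the answer
all_rows_json = json.dumps(rows)      # every row, as a JSON array

print(first_row_json)  # {"name": "Ada", "age": "36"}
print(all_rows_json)
```

All values come back as strings, because the csv module does not infer types.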
=============================Update============================
Changed the code to match the OP's requirements (and do not forget to deploy the function to Azure, or we can't get the Key Vault secret):

Signing and Verifying of Signature using Pycryptodome always fails

Hi, I'm using the PyCryptodome package to try to verify signatures of transactions in a blockchain. My issue is that when adding a new transaction, I first create a signature to be passed into a verify-transaction method, but for some reason verification always fails, even though the logic seems right when I compare it to the documentation. If anyone could point out where I'm going wrong, it would be much appreciated. I have 3 methods that handle all of this and I'm not sure where the issue is.
The generate keys method
def generate_keys(self):
    # generate private key pair
    private_key = RSA.generate(1024, Crypto.Random.new().read)
    # public key comes as part of private key generation
    public_key = private_key.publickey()
    # return keys as hexadecimal representation of binary data
    return (binascii.hexlify(public_key.exportKey(format='DER')).decode('ascii'), binascii.hexlify(private_key.exportKey(format='DER')).decode('ascii'))
The sign transaction method
def sign_transaction(self, sender, recipient, amount, key):
    # convert transaction data to SHA256 string
    hash_signer = SHA256.new(
        (str(sender) + str(recipient) + str(amount)).encode('utf-8'))
    # sign transaction
    signature = pkcs1_15.new(RSA.importKey(
        binascii.unhexlify(key))).sign(hash_signer)
    # return hexadecimal representation of signature
    return binascii.hexlify(signature).decode('ascii')
and the verify transaction method
@staticmethod
def verify_transaction(transaction):
    # convert public key back to binary representation
    public_key = RSA.importKey(binascii.unhexlify(
        transaction.sender))
    try:
        # create signature from transaction data
        hash_signer = SHA256.new(
            (str(transaction.sender) + str(transaction.recipient) + str(transaction.amount)).encode('utf-8'))
        pkcs1_15.new(public_key).verify(
            hash_signer, binascii.unhexlify(transaction.signature))
        return True
    except ValueError:
        return False
Once I've generated my key pair and try to use it to sign and verify transactions, verification always fails. I know this because the verify method always returns False, which leads me to believe a ValueError is always raised. Thanks in advance; hopefully someone can help me out.

Invalid hash, timestamp, and key combination in Marvel API Call

I'm trying to form a Marvel API Call.
Here's a link on authorization:
https://developer.marvel.com/documentation/authorization
I'm attempting to create a server-side application, so according to the link above, I need a timestamp, apikey, and hash url parameters. The hash needs be a md5 hash of the form: md5(timestamp + privateKey + publicKey) and the apikey url param is my public key.
Here's my code. I'm making the request in Python 3, using the requests library to form the request, the time library to form the timestamp, and the hashlib library to form the hash.
#request.py: making a http request to marvel api
import requests;
import time;
import hashlib;
#timestamp
ts = time.time();
ts_str = str(float(ts));
#keys
public_key = 'a3c785ecc50aa21b134fca1391903926';
private_key = 'my_private_key';
#hash and encodings
m_hash = hashlib.md5();
ts_str_byte = bytes(ts_str, 'utf-8');
private_key_byte = bytes(private_key, 'utf-8');
public_key_byte = bytes(public_key, 'utf-8');
m_hash.update(ts_str_byte + private_key_byte + public_key_byte);
m_hash_str = str(m_hash.digest());
#all request parameters
payload = {'ts': ts_str, 'apikey': 'a3c785ecc50aa21b134fca1391903926', 'hash': m_hash_str};
#make request
r = requests.get('https://gateway.marvel.com:443/v1/public/characters', params=payload);
#for debugging
print(r.url);
print(r.json());
Here's the output:
$python3 request.py
https://gateway.marvel.com:443/v1/public/characters...${URL TRUNCATED FOR READABILITY)
{'code': 'InvalidCredentials', 'message': 'That hash, timestamp, and key combination is invalid'}
$
I'm not sure what exactly is causing the combination to be invalid.
I can provide more info on request. Any info would be appreciated. Thank you!
EDIT:
I'm a little new to API calls in general. Are there any resources for understanding more about how to perform them? So far with my limited experience they seem very specific, and getting each one to work takes a while. I'm a college student and whenever I work in hackathons it takes me a long time just to figure out how to perform the API call. I admit I'm not experienced, but in general does figuring out new API's require a large learning curve, even for individuals who have done 10 or so of them?
Again, thanks for your time :)
I've also had similar issues when accessing the Marvel API. For those who are still struggling, here is my template code (which I use in a Jupyter notebook).
# import dependencies
import hashlib #this is needed for the hashing library
import time #this is needed to produce a time stamp
import json #Marvel provides its information in json format
import requests #This is used to request information from the API
#Constructing the Hash
m = hashlib.md5() #I'm assigning the method to the variable m. Marvel
#requires md5 hashing, but I could also use SHA256 or others for APIS other
#than Marvel's
ts = str(time.time()) #This creates the time stamp as a string
ts_byte = bytes(ts, 'utf-8') #This converts the timestamp into a byte
m.update(ts_byte) # I add the timestamp (in byte format) to the hash
m.update(b"my_private_key") #I add the private key to
#the hash.Notice I added the b in front of the string to convert it to byte
#format, which is required for md5
m.update(b"b2aeb1c91ad82792e4583eb08509f87a") #And now I add my public key to
#the hash
hasht = m.hexdigest() #Marvel requires the string to be in hex; they
#don't say this in their API documentation, unfortunately.
#constructing the query
base_url = "https://gateway.marvel.com" #provided in Marvel API documentation
api_key = "b2aeb1c91ad82792e4583eb08509f87a" #My public key
query = "/v1/public/events" +"?" #My query is for all the events in Marvel Uni
#Building the actual query from the information above
query_url = base_url + query +"ts=" + ts+ "&apikey=" + api_key + "&hash=" + hasht
print(query_url) #I like to look at the query before I make the request to
#ensure that it's accurate.
#Making the API request and receiving info back as a json
data = requests.get(query_url).json()
print(data) #I like to view the data to make sure I received it correctly
Give credit where credit is due, I relied on this blog a lot. You can go here for more information on the hashlib library. https://docs.python.org/3/library/hashlib.html
I noticed in your terminal your MD5 hash is uppercase. MD5 should output in lowercase. Make sure you convert to that.
That was my issue, I was sending an uppercase hash.
As mentioned above, the solution was that the hash wasn't formatted properly. It needed to be a hexadecimal string (hexdigest() rather than digest()); with that change the issue is resolved.
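To make the difference concrete, here is a minimal sketch using only hashlib. The keys are placeholders, and marvel_hash is a helper name made up for this example. It builds md5(ts + private_key + public_key) as a hex string and contrasts it with what str(m_hash.digest()) produces in the question's code.

```python
import hashlib


def marvel_hash(ts: str, private_key: str, public_key: str) -> str:
    """Return md5(ts + private_key + public_key) as a lowercase hex string."""
    payload = (ts + private_key + public_key).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


# Placeholder credentials, not real keys.
ts = "1"
private_key = "my_private_key"
public_key = "my_public_key"

good_hash = marvel_hash(ts, private_key, public_key)
params = {"ts": ts, "apikey": public_key, "hash": good_hash}

# What the question's code sends instead: the repr of the raw digest bytes.
bad_hash = str(hashlib.md5((ts + private_key + public_key).encode("utf-8")).digest())

print(len(good_hash))            # 32
print(bad_hash.startswith("b"))  # True
```

The params dictionary has the same shape as the payload in the question, so it could be passed to requests.get(..., params=params).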
Your final URL should be like this:
http://gateway.marvel.com/v1/public/characters?apikey=(public_key)&ts=1&hash=(md5_type_hash)
You already have the public key in your developer account. But how do you produce md5_type_hash?
For ts, just use 1. The string to hash then follows this pattern:
1 + private_key(ee7) + public_key(aa3). For example: Convert 1ee7aa3 to MD5-Hash = 1ca3591360a252817c30a16b615b0afa (md5_type_hash)
You can create it with this website: https://www.md5hashgenerator.com
Done, you can use the Marvel API now!

Error Using geopy library

I have the following question, I want to set up a routine to perform iterations inside a dataframe (pandas) to extract longitude and latitude data, after supplying the address using the 'geopy' library.
The routine I created was:
import time
import os
import pandas as pd
from geopy.geocoders import GoogleV3
arquivo = pd.ExcelFile('path')
df = arquivo.parse("Table1")
def set_proxy():
    proxy_addr = 'http://{user}:{passwd}@{address}:{port}'.format(
        user='usuario', passwd='senha',
        address='IP', port=int('PORTA'))
    os.environ['http_proxy'] = proxy_addr
    os.environ['https_proxy'] = proxy_addr
def unset_proxy():
    os.environ.pop('http_proxy')
    os.environ.pop('https_proxy')
set_proxy()
geo_keys = ['AIzaSyBXkATWIrQyNX6T-VRa2gRmC9dJRoqzss0'] # API Google
geolocator = GoogleV3(api_key=geo_keys )
for index, row in df.iterrows():
    location = geolocator.geocode(row['NO_LOGRADOURO'], timeout=10)
    time.sleep(2)
    lat=location.latitude
    lon=location.longitude
    address = location.address
    unset_proxy()
    print(str(lat) + ', ' + str(lon))
The problem I'm having is that when I run the code the following error is thrown:
GeocoderQueryError: Your request was denied.
I tried the creation without passing the key to the google API, however, I get the following message.
KeyError: 'http_proxy'
and if I remove the unset_proxy () statement from within the for, the message I receive is:
GeocoderQuotaExceeded: The given key has gone over the requests limit in the 24 hour period or has submitted too many requests in too short a period of time.
But I only made 5 requests today, and I'm putting a 2-second sleep between requests. Should the period be longer?
Any idea?
The api_key argument of the GoogleV3 class must be a string, not a list of strings (that's the cause of your first issue).
geopy doesn't guarantee that the http_proxy/https_proxy environment variables are respected (especially runtime modifications of os.environ). The usage of proxies advised by the docs is:
geolocator = GoogleV3(proxies={'http': proxy_addr, 'https': proxy_addr})
PS: Please don't ever post your API keys in public. I suggest revoking the key you've posted in the question and generating a new one, to prevent it from being abused by someone else.
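As for the KeyError: 'http_proxy' reported in the question: it is consistent with unset_proxy() running more than once, because os.environ.pop(key) raises KeyError when the variable is already gone. Here is a minimal sketch of a tolerant version. The proxy address is a placeholder, and passing the address as a parameter is a choice made for this example.

```python
import os


def set_proxy(proxy_addr: str) -> None:
    os.environ["http_proxy"] = proxy_addr
    os.environ["https_proxy"] = proxy_addr


def unset_proxy() -> None:
    # pop(key, None) does nothing when the variable is already absent,
    # so calling this on every loop iteration no longer raises KeyError.
    os.environ.pop("http_proxy", None)
    os.environ.pop("https_proxy", None)


set_proxy("http://user:passwd@proxy.example:8080")  # placeholder address
unset_proxy()
unset_proxy()  # second call is harmless

print("http_proxy" in os.environ)  # False
```

Calling set_proxy() once before the loop and unset_proxy() once after it would avoid the repeated calls altogether.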
