Securely using a variable in URL in a Flask webapp - security

I am using Flask to make a webapp that will contain a variable within the URL (example below).
@app.route('/landingpage/<id>')  # /landingpage/A
def landing_page(id):
    # Storage of a hashed <id>...
My question relates to the variable within the URL, which could contain confidential information that should not be accessible to anyone other than the person entering the URL.
Would it be sufficient for the connection to be made over HTTPS to prevent anyone else from accessing the variable before it is stored in a hashed format?

You can use a UUID in the URL; Flask has support for this type of URL, and I recommend the flask-uuid library to generate the UUID for your client to access.
import uuid

from flask import make_response
from flask_uuid import FlaskUUID

flask_uuid = FlaskUUID()
flask_uuid.init_app(app)

@app.route('/personalID')
def generate_uuid():
    return make_response({'uuid': str(uuid.uuid4())})

@app.route('/landingpage/<uuid:id>')  # e.g. /landingpage/<some uuid>
def landing_page(id):
    return str(id)  # 'id' is a uuid.UUID instance
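If the value is hashed as soon as it arrives, the raw id never needs to be persisted. A minimal sketch of that step, assuming the standard library's hashlib and a hypothetical store_hashed_id() helper:

import hashlib

@app.route('/landingpage/<uuid:id>')
def landing_page(id):
    # Hash the URL variable before storing it, so the raw value never touches the database.
    hashed_id = hashlib.sha256(str(id).encode('utf-8')).hexdigest()
    store_hashed_id(hashed_id)  # hypothetical storage helper
    return 'ok'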


Can't list bucket objects on Scaleway using boto3

I saw a few similar posts, but unfortunately none helped me.
I have an S3 bucket (on Scaleway), and I'm trying to simply list all objects contained in that bucket, using the boto3 S3 client as follows:
import boto3

s3 = boto3.client(
    's3',
    region_name=AWS_S3_REGION_NAME,
    endpoint_url=AWS_S3_ENDPOINT_URL,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
)
all_objects = s3.list_objects_v2(Bucket=AWS_STORAGE_BUCKET_NAME)
This simple piece of code responds with an error:
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the ListObjects operation: The specified key does not exist.
First, the error seems inappropriate to me since I'm not specifying any key to search for. I also tried passing a Prefix argument to this method to narrow the search down to a specific subdirectory, but got the same error.
Second, I tried to achieve the same thing using a boto3 Resource rather than a Client, as follows:
session = boto3.Session(
    region_name=AWS_S3_REGION_NAME,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
)
resource = session.resource(
    's3',
    endpoint_url=AWS_S3_ENDPOINT_URL,
)
for bucket in resource.buckets.all():
    print(bucket.name)
That code produces absolutely nothing. One weird thing that strikes me is that I don't pass the bucket_name anywhere here, which seems to be normal according to the AWS documentation.
There's no chance that I misconfigured the client, since I'm able to use the put_object method perfectly well with that same client. One strange thing, though: when I want to put a file, I pass the whole path to put_object as Key (as I found that to be the way to go), but the object is inserted with the bucket name prepended to it. So if I call put_object(Key='/path/to/myfile.ext'), the object ends up as /bucket-name/path/to/myfile.ext.
Is this strange behavior the key to my problem? How can I investigate what's happening, or is there another way I could try to list the bucket's files?
Thank you
EDIT: So, after logging the request that the boto3 client is sending, I noticed that the bucket name is appended to the URL, so instead of requesting https://<bucket_name>.s3.<region>.<provider>/, it requests https://<bucket_name>.s3.<region>.<provider>/<bucket-name>/, which leads to the NoSuchKey error.
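For anyone who wants to reproduce that, request logging can be switched on with boto3's built-in stream logger; a minimal sketch (standard boto3/botocore logging, nothing Scaleway-specific):

import logging

import boto3

# Log everything botocore does, including the parameters and URL of each request it builds.
boto3.set_stream_logger('botocore', logging.DEBUG)

# ... create the client exactly as in the question ...
all_objects = s3.list_objects_v2(Bucket=AWS_STORAGE_BUCKET_NAME)
# The debug output now includes the request parameters and the URL being requested.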
I took a look into the botocore library, and I found this:
url = _urljoin(endpoint_url, r['url_path'], host_prefix)
in botocore.awsrequest line 252, where r['url_path'] contains /skichic-bucket?list-type=2. So from here, I should be able to easily patch the library core to make it work for me.
Plus, the Prefix argument is not working: whatever I pass into it, I always receive the whole bucket's contents, but I guess I can easily patch this too.
Now, this isn't satisfying: since there's no issue related to this on GitHub, I can't believe that the library contains such a bug and that I'm the first one to encounter it.
Can anyone explain this whole mess? >.<
For those facing the same issue, try changing the endpoint_url parameter in your boto3 client or resource instantiation from https://<bucket_name>.s3.<region>.<provider> to https://s3.<region>.<provider>; e.g. for Scaleway: https://s3.<region>.scw.cloud.
You can then set the Bucket parameter to select the bucket you want.
list_objects_v2(Bucket=<bucket_name>)
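Put together, a minimal sketch of the corrected setup might look like this (the credential and bucket variables are the ones from the question; fr-par is just an example Scaleway region):

import boto3

# The endpoint no longer contains the bucket name; the bucket is selected per call.
s3 = boto3.client(
    's3',
    region_name=AWS_S3_REGION_NAME,
    endpoint_url='https://s3.fr-par.scw.cloud',
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
)

response = s3.list_objects_v2(Bucket=AWS_STORAGE_BUCKET_NAME)
for obj in response.get('Contents', []):
    print(obj['Key'])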
You can try this. You'll have to use your resource instead of my s3sr.

import boto3

s3sr = boto3.resource('s3')
bucket = 'your-bucket'
prefix = 'your-prefix/'  # if no prefix, pass ''

def get_keys_from_prefix(bucket, prefix):
    '''gets list of keys for given bucket and prefix'''
    keys_list = []
    paginator = s3sr.meta.client.get_paginator('list_objects_v2')
    # use Delimiter to limit search to that level of hierarchy
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter='/'):
        keys = [content['Key'] for content in page.get('Contents', [])]
        print('keys in page: ', len(keys))
        keys_list.extend(keys)
    return keys_list

keys_list = get_keys_from_prefix(bucket, prefix)
After looking more closely into things, I've found out that (a lot of) botocore's service endpoint patterns start with the bucket name. For example, here's the definition of the ListObjectsV2 operation:
"ListObjectsV2":{
"name":"ListObjectsV2",
"http":{
"method":"GET",
"requestUri":"/{Bucket}?list-type=2"
},
My guess is that in the standard implementation of AWS S3, there's a generic endpoint_url (which explains @jordanm's comment) and the targeted bucket is reached through that endpoint.
Now, in the case of Scaleway, there's an endpoint_url for each bucket, with the bucket name contained in that URL (e.g. https://<bucket_name>.s3.<region>.<provider>), so the request path should start directly with an object Key.
I made a fork of botocore where I rewrote every endpoint to remove the bucket name, in case that helps someone in the future.
Thanks again to all contributors!

Function service_account.Credentials.from_service_account_info() not working

I'm writing an application based on GCP services and I need to access an external project. I stored in my Firestore database the authentication file's information for the other project I need to access. I read this documentation and tried to apply it, but my code does not work. As the documentation says, what I pass to the authentication method is a dict[str, str].
This is my code:
from googleapiclient import discovery
from google.oauth2 import service_account
from google.cloud import firestore

project_id = body['project_id']
user = body['user']
snap_id = body['snapshot_id']
debuggee_id = body['debuggee_id']

db = firestore.Client()
ref = db.collection(u'users').document(user).collection(u'projects').document(project_id)
if ref.get().exists:
    service_account_info = ref.get().to_dict()
else:
    return None, 411

credentials = service_account.Credentials.from_service_account_info(
    service_account_info,
    scopes=['https://www.googleapis.com/auth/cloud-platform'])

service = discovery.build('clouddebugger', 'v2', credentials=credentials)
body is just a dictionary containing all the information about the other project. What I can't understand is why this doesn't work, while using the method from_service_account_file does work.
The following code gives that method the same information as the previous code, but from a JSON file instead of a dictionary. Maybe the order of the elements is different, but I don't think that matters at all.
credentials = service_account.Credentials.from_service_account_file(
    [PATH_TO_PROJECT_KEY_FILE],
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
Can you tell me what I'm doing wrong with the method from_service_account_info?
Problem solved. When I posted the question, I had manually inserted all the info about the other project from the GCP Firestore Console. Then I wrote code to do it automatically, and it worked. Honestly, I don't know why it didn't work before; the information put into Firestore was the same, and so was the format.
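For reference, from_service_account_file essentially loads the JSON key file into a dict and builds the credentials the same way as from_service_account_info, so the two should only differ if the dictionary doesn't exactly match the file. A small sketch for comparing the two inputs (just an illustration; private_key is one of the standard keys in a service-account file):

import json

with open(PATH_TO_PROJECT_KEY_FILE) as f:
    info_from_file = json.load(f)

# If the Firestore document holds the same data, both checks should print True,
# including the exact newlines inside the private key.
print(sorted(info_from_file) == sorted(service_account_info))
print(info_from_file['private_key'] == service_account_info['private_key'])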

How to check the country origin of a url

I have a list of URLs and want to find out their country of origin via Python 3, and was wondering if anyone could help.
For example: quikr.com or kooora.com
Thanks
You can use geoip2, either via a database you need to download (which I did) or by creating an account.
Thanks @AlexVorndran for the initial help.
Code example:
import geoip2.database
import socket
ip = socket.gethostbyname('nike.com')
reader = geoip2.database.Reader('GeoLite2-Country_20190305/GeoLite2-Country.mmdb')
response = reader.country(ip)
response.country.iso_code # Results in 'US'
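Since the question starts from a list of URLs, a minimal sketch of looping over them with the same reader (the hostnames and the database path are just examples):

import socket

import geoip2.database

urls = ['quikr.com', 'kooora.com', 'nike.com']  # example hostnames
reader = geoip2.database.Reader('GeoLite2-Country_20190305/GeoLite2-Country.mmdb')

for url in urls:
    ip = socket.gethostbyname(url)        # resolve the hostname to an IP address
    country = reader.country(ip).country  # look the IP up in the GeoLite2 database
    print(url, country.iso_code, country.name)

reader.close()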

Get username from local session Telethon

I'm using the Telethon library to crawl some Telegram channels. While crawling, I need to resolve many join links, usernames and channel IDs. To resolve these items, I used the method client.get_entity(), but after a while the Telegram servers banned my crawler for resolving too many usernames. I searched around and found from this issue that I should use get_input_entity() instead of get_entity(). Telethon actually saves entities inside a local SQLite file, and whenever a call to get_input_entity() is made, it first searches the local SQLite database; if no match is found, it then sends a request to the Telegram servers. So far so good, but I have two problems with this approach:
get_input_entity() just returns two attributes, ID and hash, but there are other columns like username, phone and name in the SQLite database. I need a method that doesn't just return ID and hash, but returns the other columns too.
I need to control the number of resolve requests sent to the Telegram servers, but get_input_entity() sends a request whenever it finds no match in the local database. The problem is that I can't control when this method requests the Telegram servers. What I actually need is a boolean argument for this method indicating whether or not it should send a request to the Telegram servers when no match is found in the local database.
I read some of the Telethon source code, mainly get_input_entity(), and wrote my own version of it:
def my_own_get_input_entity(self, target, with_info: bool = False):
    if self._client:
        if target in ('me', 'self'):
            return types.InputPeerSelf()

        def get_info():
            nonlocal self, result
            res_id = 0
            if isinstance(result, InputPeerChannel):
                res_id = result.channel_id
            elif isinstance(result, InputPeerChat):
                res_id = result.chat_id
            elif isinstance(result, InputPeerUser):
                res_id = result.user_id
            return self._sqlite_session._execute(
                'select username, name from entities where id = ?', res_id
            )

        try:
            result = self._client.session.get_input_entity(target)
            info = get_info() if with_info else None
            return result, info
        except ValueError:
            record_current_time()
            try:
                # when we are here, we are actually going to
                # send request to telegram servers
                if not check_if_appropriate_time_elapsed_from_last_telegram_request():
                    return None
                result = self._client.get_input_entity(target)
                info = get_info() if with_info else None
                return result, info
            except ChannelPrivateError:
                pass
            except ValueError:
                pass
            except Exception:
                pass
But my code has performance problems because it makes redundant queries to the SQLite database. For example, if the target is actually an entity inside the local database and with_info is True, it first queries the local database in the line self._client.session.get_input_entity(target) and then, since with_info is True, queries the database again to get the username and name columns. In another situation, if the target is not found inside the local database, calling self._client.get_input_entity(target) makes a redundant call to the local database.
Knowing these performance issues, I delved deeper into the Telethon source code, but as I don't know much about asyncio, I couldn't write any better code than the above.
Any ideas on how to solve these problems?
client.session.get_input_entity will make no API call (it can't), and fails if there is no match in the local database, which is probably the behaviour you want.
You can, for now, access the client.session._conn private attribute. It's a sqlite3.Connection object so you can use that to make all the queries you want. Note that this is prone to breaking since you're accessing a private member although no changes are expected soon. Ideally, you should subclass the session file to suit your needs. See Session Files in the documentation.
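A minimal sketch of that approach, reusing the entities table and the columns the question already queries (the private _conn attribute is exactly the one mentioned above, so this can break in future Telethon versions):

def get_cached_entity_info(client, entity_id):
    # Query Telethon's local SQLite cache directly; no request is ever sent to Telegram.
    conn = client.session._conn  # private sqlite3.Connection of the default SQLite session
    row = conn.execute(
        'select username, name from entities where id = ?', (entity_id,)
    ).fetchone()
    return row  # None if the entity is not cached locally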

What is the best way to encrypt stored data in web2py?

I need to encrypt data stored in web2py, more precisely passwords.
This is not about authentication, but more something along the lines of a KeePass-like application.
I've seen that PyMe is included in web2py, and M2Secret could easily do that. With M2Secret I can use this:
import m2secret
# Encrypt
secret = m2secret.Secret()
secret.encrypt('my data', 'my master password')
serialized = secret.serialize()
# Decrypt
secret = m2secret.Secret()
secret.deserialize(serialized)
data = secret.decrypt('my master password')
But I would have to include the M2Crypto library in my application.
Is there a way to do this with PyMe, which is already included with web2py?
By default, web2py stores passwords hashed using HMAC+SHA512, so there is nothing for you to do. That is better than the mechanism you suggest, because encryption is reversible while hashing is not. You can change this and do what you ask above, but it would not be any more secure than using plaintext (since you would have to expose the encryption key in the app).
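To see why the default is preferable, here is a minimal sketch of HMAC+SHA512 hashing using only the standard library (this shows the general idea, not web2py's exact salting or storage format):

import hashlib
import hmac

application_key = b'per-application secret key'

def hash_password(password):
    # One-way: the digest can be recomputed and compared, but the password cannot be recovered.
    return hmac.new(application_key, password.encode('utf-8'), hashlib.sha512).hexdigest()

stored = hash_password('my master password')
print(stored == hash_password('my master password'))  # True: verification needs no decryption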
Anyway. Let's say you have a table like
db.define_table('mytable', Field('myfield', 'password'))
and you want to use m2secret. You would do:
import m2secret

class MyValidator:
    def __init__(self, key):
        self.key = key
    def __call__(self, value):
        # Encrypt on the way into the database; web2py validators return (value, error).
        secret = m2secret.Secret()
        secret.encrypt(value, self.key)
        return (secret.serialize(), None)
    def formatter(self, value):
        # Decrypt on the way out, for display in forms.
        secret = m2secret.Secret()
        secret.deserialize(value)
        return (secret.decrypt(self.key), None)

db.mytable.myfield.requires = MyValidator("master password")
In web2py, validators are also two-way filters.
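As a rough illustration of that round trip (a sketch only; in practice web2py calls these methods for you during form validation and display):

v = MyValidator("master password")

stored, error = v('my data')    # ciphertext that gets written to the database
plain, _ = v.formatter(stored)  # 'my data' again when the field is displayed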
