Cannot login to website using requests module (Python version 3.5.1) - python-3.x

I am trying to access a website to scrape some information, however I am having trouble posting login information through Python. Here is my code so far:
import requests
c = requests.Session()
url = 'https://subscriber.hoovers.com/H/login/login.html'
USERNAME = 'user'
PASSWORD = 'pass'
c.get(url)
csrftoken = c.cookies['csrftoken']
login_data = dict(j_username=USERNAME, j_password=PASSWORD,
csrfmiddlewaretoken=csrftoken, next='/')
c.post(url, data=login_data, headers=dict(Referer=url))
page = c.get('http://subscriber.hoovers.com/H/home/index.html')
print(page.content)
Here is the form data from the post login page:
j_username:user
j_password:pass
OWASP_CSRFTOKEN:8N0Z-TND5-NV71-C4N4-43BK-B13S-A1MO-NZQC
OWASP_CSRFTOKEN:8N0Z-TND5-NV71-C4N4-43BK-B13S-A1MO-NZQC
Here is the error I receive:
Traceback (most recent call last):
File "C:/Users/10023539/Desktop/pyscripts/webscraper ex.py", line 9, in <module>
csrftoken = c.cookies['csrftoken']
File "C:\Program Files (x86)\Python35-32\Lib\site-packages\requests\cookies.py", line 293, in __getitem__
return self._find_no_duplicates(name)
File "C:\Program Files (x86)\Python35-32\Lib\site-packages\requests\cookies.py", line 351, in _find_no_duplicates
raise KeyError('name=%r, domain=%r, path=%r' % (name, domain, path))
KeyError: "name='csrftoken', domain=None, path=None"
I believe the issue has something to do with the 'OWASP_CSRFTOKEN' label? I haven't found any solutions for this specific CSRF name anywhere online. I've also tried removing the c.cookies method and manually typing in the CSRF code into the csrfmiddlewaretoken argument. I've also tried changing the referal URL around, still getting the same error.
Any assistance would be greatly appreciated.

First of all you catch KeyError exception, this mean that cookies dictionary have no key csrftoken.
So you need explore your response for find right CSRF token cookie name.
For example you can print all cookies:
for key in c.cookies.keys():
print('%s: %s' % (key, c.cookies[key]))
UPD: Actually your response have no CSRF cookie.
you need look token in your c.text with pyquery
<input type="hidden" name="OWASP_CSRFTOKEN" class="csrfClass" value="X48L-NEYI-CG18-SJOD-VDW9-FGEB-7WIT-88P4">

Related

AWS secret manager time out sometimes

I am fetching a secret from secret manager on a lambda. The request fails sometimes. Which is totally strange, it is working fine and couple of hours later I check and I am getting time out.
def get_credentials(self):
"""Retrieve credentials from the Secrets Manager service."""
boto_config = BotoConfig(connect_timeout=3, retries={"max_attempts": 3})
secrets_client = self.boto_session.client(
service_name="secretsmanager",
region_name=self.boto_session.region_name,
config=boto_config,
)
secret_value = secrets_client.get_secret_value(SecretId=self._secret_name)
secret = secret_value["SecretString"]
I try to debug the lambda and later seems to be working again, without any change, those state changes happen in hours. Any hint why that could happen?
Traceback (most recent call last):
File "/opt/python/botocore/endpoint.py", line 249, in _do_get_response
http_response = self._send(request)
File "/opt/python/botocore/endpoint.py", line 321, in _send
return self.http_session.send(request)
File "/opt/python/botocore/httpsession.py", line 438, in send
raise ConnectTimeoutError(endpoint_url=request.url, error=e)
botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "https://secretsmanager.eu-central-1.amazonaws.com/"
You are using the legacy retry mode (is the default one in boto3), which has very limited functionality as it only works for a very limited number of errors.
You should try changing your retry strategy to something like Standard retry mode or Adaptative. In that case your code would look like:
from botocore.config import Config as BotoConfig
boto_config = BotoConfig(
connect_timeout=3,
retries={
"max_attempts": 3,
"mode":"standard"
}
)
secrets_client = self.boto_session.client(
service_name="secretsmanager",
region_name=self.boto_session.region_name,
config=boto_config,
)

Why importing data to Zoho Analytics API causes error?

My goal is to write a script in Python3 that will push data in existing table in Zoho Analytics, the script will be used by a scheduler once a week.
What I have tried to far:
I can successfully import some data using cURL commands. Like so
curl -X POST \ 'https://analyticsapi.zoho.com/api/OwnerEmail/Workspace/TableName?ZOHO_ACTION=IMPORT& ZOHO_OUTPUT_FORMAT=JSON&ZOHO_ERROR_FORMAT=JSON&ZOHO_API_VERSION=1.0&ZOHO_IMPORT_TYPE=APPEND&ZOHO_AUTO_IDENTIFY=True&ZOHO_ON_IMPORT_ERROR=ABORT&ZOHO_CREATE_TABLE=False' \ -H 'Authorization: Zoho-oauthtoken *******' \ -H 'content-type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW' \ -F ZOHO_FILE='path_to_csv'
What I found out that the ReportClient provided by Zoho Analytics team
Zoho Report Client for Python is not compatible with Python 3. Hence, I installed a wrapped for this ReportClient from [here] (https://pypi.org/project/zoho-analytics-connector).
Following sample examples from Zoho website and tests in github of wrapper for Zoho in Python3, I implement something like this:
Have a class to keep my ENV variables
import os
from zoho_analytics_connector.report_client import ReportClient, ServerError
from zoho_analytics_connector.enhanced_report_client import EnhancedZohoAnalyticsClient
class ZohoTracking:
LOGINEMAILID = os.getenv("ZOHOANALYTICS_LOGINEMAIL")
REFRESHTOKEN = os.getenv("ZOHOANALYTICS_REFRESHTOKEN")
CLIENTID = os.getenv("ZOHOANALYTICS_CLIENTID")
CLIENTSECRET = os.getenv("ZOHOANALYTICS_CLIENTSECRET")
DATABASENAME = os.getenv("ZOHOANALYTICS_DATABASENAME")
OAUTH = True
TABLENAME = "My Table"
Instantiate the Client Class
def get_enhanced_zoho_analytics_client(self) -> EnhancedZohoAnalyticsClient:
assert (not self.OAUTH and self.AUTHTOKEN) or (self.OAUTH and self.REFRESHTOKEN)
rc = EnhancedZohoAnalyticsClient(
// Just setting email, token, etc using class above
...
)
return rc```
Then have a method to upload data to existing table, the data_upload() function has the problem.
def enhanced_data_upload(self):
enhanced_client = self.get_enhanced_zoho_analytics_client()
try:
with open("./import/tracking3.csv", "r") as f:
import_content = f.read()
print(type(import_content))
except Exception as e:
print(f"Error:Check if file exists in the import directory {str(e)}")
return
res = enhanced_client.data_upload(import_content=import_content, table_name=ZohoTracking.TABLENAME)
assert res
Traceback (most recent call last):
File "push2zoho.py", line 106, in <module>
sample.enhanced_data_upload()
File "push2zoho.py", line 100, in enhanced_data_upload
res = enhanced_client.data_upload(import_content=import_content, table_name=ZohoTracking.TABLENAME)
File "/Users/.../zoho_analytics_connector/enhanced_report_client.py", line 99, in data_upload
matching_columns=matching_columns)
File "/Users/.../site-packages/zoho_analytics_connector/report_client.py", line 464, in importData_v2
r=self.__sendRequest(url=url,httpMethod="POST",payLoad=payload,action="IMPORT",callBackData=None)
File "/Users/.../zoho_analytics_connector/report_client.py", line 165, in __sendRequest
raise ServerError(respObj)
File "/Users/.../zoho_analytics_connector/report_client.py", line 1830, in __init__
contHeader = urlResp.headers["Content-Type"]
TypeError: 'NoneType' object is not subscriptable
That is the error I receive. What am I missing in this puzzle? Help is appreciated
In Feb 2021 I changed this inherited Zoho code in my library.
Now it is:
contHeader = urlResp.headers.get("Content-Type",None)
which avoids the final exception you had.

Trouble sending a batch create entity request in dialogflow

I have defined the following function. The purpose is to make batch create entity request with dialogflow client. I am using this method after sending many individual tests did not scale well.
The problem seems to be the line that defines EntityType. Seems like "entityType" is not valid but that is what is in the dialogflow v2 documentation which is the current version I am using.
Any ideas on what the issue is?
def create_batch_entity_types(self):
client = self.get_entity_client()
print(DialogFlowClient.batch_list)
EntityType = {
"entityTypes": DialogFlowClient.batch_list
}
response = client.batch_update_entity_types(parent=AGENT_PATH, entity_type_batch_inline=EntityType)
def callback(operation_future):
# Handle result.
result = operation_future.result()
print(result)
response.add_done_callback(callback)
After running the function I received this error
Traceback (most recent call last):
File "df_client.py", line 540, in <module>
create_entity_types_from_database()
File "df_client.py", line 426, in create_entity_types_from_database
df.create_batch_entity_types()
File "/Users/andrewflorial/Documents/PROJECTS/curlbot/dialogflow/dialogflow_accessor.py", line 99, in create_batch_entity_types
response = client.batch_update_entity_types(parent=AGENT_PATH, entity_type_batch_inline=EntityType)
File "/Users/andrewflorial/Documents/PROJECTS/curlbot/venv/lib/python3.7/site-packages/dialogflow_v2/gapic/entity_types_client.py", line 767, in batch_update_entity_types
update_mask=update_mask,
ValueError: Protocol message EntityTypeBatch has no "entityTypes" field.
The argument for entity_type_batch_inline must have the same form as EntityTypeBatch.
Look how that type looks like: https://dialogflow-python-client-v2.readthedocs.io/en/latest/gapic/v2/types.html#dialogflow_v2.types.EntityTypeBatch
It has to have entity_types field, not entityTypes.

Grabbing files from Microsoft Teams using Python

Teams seems to lack any native way of mirroring files to a shared directory. I'm Trying use Python (or another language but python preferred!) to either:
a. Directly pull from microsoft teams into memory using Python to process with Pandas
b. Copy files from teams into a shared network folder (which Python could then read in)
I found this but can't get it to work with teams - the teams URLs don't look anything like these do. How to read SharePoint Online (Office365) Excel files in Python with Work or School Account?
It seems close to what I want to do though. I also found "pymsteams" on PyPi repository. https://pypi.org/project/pymsteams/ which just seems to let you send messages to Teams and nothing else? unless I misunderstand something.
https://pypi.org/project/Office365-REST-Python-Client/
https://pypi.org/project/pymsteams/
from office365.runtime.auth.authentication_context
import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.file import File
url = 'https://teams.microsoft.com/l/file'
username = 'myusername'
password = 'mypassword'
relative_url ='myurl'
ctx_auth = AuthenticationContext(url)
ctx_auth.acquire_token_for_user(username, password)
Trying to run the above code gives
AttributeError: 'NoneType' object has no attribute 'text'
Full stack trace:
runfile('H:/repos/foo/untitled0.py', wdir='H:/repos/foo')
Traceback (most recent call last):
File "<ipython-input-35-314ab7dc63c9>", line 1, in <module>
runfile('H:/repos/foo/untitled0.py', wdir='H:/foo/image_ai')
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 786, in runfile
execfile(filename, namespace)
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "H:/repos/image_ai/untitled0.py", line 10, in <module>
ctx_auth.acquire_token_for_user(username, password)
File "C:\ProgramData\Anaconda3\lib\site-packages\office365\runtime\auth\authentication_context.py", line 18, in acquire_token_for_user
return self.provider.acquire_token()
File "C:\ProgramData\Anaconda3\lib\site-packages\office365\runtime\auth\saml_token_provider.py", line 57, in acquire_token
self.acquire_service_token(options)
File "C:\ProgramData\Anaconda3\lib\site-packages\office365\runtime\auth\saml_token_provider.py", line 88, in acquire_service_token
token = self.process_service_token_response(response)
File "C:\ProgramData\Anaconda3\lib\site-packages\office365\runtime\auth\saml_token_provider.py", line 119, in process_service_token_response
return token.text
AttributeError: 'NoneType' object has no attribute 'text'
I've managed to get it working using the Office-365-REST-Python-Client you linked.
If you're using SharePoint Online, you'll need to get an App-Only principle set up and connect using the acquire_token_for_app function instead of acquire_token_for_user, then passing a client_id and client_secret instead of username & password.
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.file import File
client_id = 'yourclientid'
client_secret = 'yourclientsecret'
url = 'https://yoursharepointsite.com/teams/yourteam'
relative_url = '/teams/yourteam/Shared%20Documents/yourteamschannel/yourdoc.extension'
ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_app(client_id, client_secret):
ctx = ClientContext(url, ctx_auth)
with open(filename, 'wb') as output_file:
response = File.open_binary(ctx, relative_url)
output_file.write(response.content)
else:
print(ctx_auth.get_last_error())
This should download your file to a local drive (specified via the filename variable) and you can then load into pandas etc to process

Twisted upload muliple files key error

I'm attempting to write a Twisted Webserver in Python 3.6 that can upload multiple files but being fairly new to both Python and things Web I have ran into an issue I don't understand and I also do not find any good examples relating to multiple file upload.
With the following code I get
from twisted.web import server, resource
from twisted.internet import reactor, endpoints
import cgi
class Counter(resource.Resource):
isLeaf = True
numberRequests = 0
def render_GET(self, request):
print("GET " + str(self.numberRequests))
self.numberRequests += 1
# request.setHeader(b"content-type", b"text/plain")
# content = u"I am request #{}\n".format(self.numberRequests)
content = """<html>
<body>
<form enctype="multipart/form-data" method="POST">
Text: <input name="text1" type="text" /><br />
File: <input name="file1" type="file" multiple /><br />
<input type="submit" />
</form>
</body>
</html>"""
print(request.uri)
return content.encode("ascii")
def render_POST(selfself, request):
postheaders = request.getAllHeaders()
try:
postfile = cgi.FieldStorage(
fp=request.content,
headers=postheaders,
environ={'REQUEST_METHOD': 'POST',
# 'CONTENT_TYPE': postheaders['content-type'], Gives builtins.KeyError: 'content-type'
}
)
except Exception as e:
print('something went wrong: ' + str(e))
filename = postfile["file"].filename #file1 tag also does not work
print(filename)
file = request.args["file"][0] #file1 tag also does not work
endpoints.serverFromString(reactor, "tcp:1234").listen(server.Site(Counter()))
reactor.run()
Error log
C:\Users\theuser\AppData\Local\conda\conda\envs\py36\python.exe C:/Users/theuser/PycharmProjects/myproject/twweb.py
GET 0
b'/'
# My comment POST starts here
Unhandled Error
Traceback (most recent call last):
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\http.py", line 1614, in dataReceived
finishCallback(data[contentLength:])
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\http.py", line 2029, in _finishRequestBody
self.allContentReceived()
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\http.py", line 2104, in allContentReceived
req.requestReceived(command, path, version)
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\http.py", line 866, in requestReceived
self.process()
--- <exception caught here> ---
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\server.py", line 195, in process
self.render(resrc)
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\server.py", line 255, in render
body = resrc.render(self)
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\resource.py", line 250, in render
return m(request)
File "C:/Users/theuser/PycharmProjects/myproject/twweb.py", line 42, in render_POST
filename = postfile["file"].filename #file1 tag also does not work
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\cgi.py", line 604, in __getitem__
raise KeyError(key)
builtins.KeyError: 'file'
I don't understand how to get hold of the file or files so I can save them uploaded after the form Submit in render_POST. This post SO appears to not have this problem. The final app should be able to do this asynchronous for multiple users but before that I would be happy to get just a simple app working.
Using conda on Windows 10 with Python 3.6
FieldStorage in Python 3+ returns a dict with keys as bytes, not strings. So you must access keys like this:
postfile[ b"file" ]
Notice that the key is appended with b"". It's a tad bit confusing if you're new to Python and are unaware of the changes between Python 3 and 2 strings.
Also I answered a similar question a while back but was unable to get it working properly on Python 3.4 but I can't remember what exactly didn't work. Hopefully you don't run into any issues using 3.6.

Resources