I am trying to get cookies from the browser using aiohttp. From the docs and googling I have only found articles about setting cookies in aiohttp.
In flask I would get the cookies as simply as
cookie = request.cookies.get('name_of_cookie')
# do something with cookie
Is there a simple way to fetch the cookie from browser using aiohttp?
Is there a simple way to fetch the cookie from the browser using aiohttp?
Not sure about whether this is simple but there is a way:
import asyncio
import aiohttp
async def main():
urls = [
'http://httpbin.org/cookies/set?test=ok',
]
for url in urls:
async with aiohttp.ClientSession(cookie_jar=aiohttp.CookieJar()) as s:
async with s.get(url) as r:
print('JSON', await r.json())
cookies = s.cookie_jar.filter_cookies('http://httpbin.org')
for key, cookie in cookies.items():
print('Key: "%s", Value: "%s"' % (cookie.key, cookie.value))
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
The program generates the following output:
JSON: {'cookies': {'test': 'ok'}}
Key: "test", Value: "ok"
Example adapted from https://aiohttp.readthedocs.io/en/stable/client_advanced.html#custom-cookies + https://docs.aiohttp.org/en/stable/client_advanced.html#cookie-jar
Now if you want to do a request using a previously set cookie:
import asyncio
import aiohttp
url = 'http://example.com'
# Filtering for the cookie, saving it into a varibale
async with aiohttp.ClientSession(cookie_jar=aiohttp.CookieJar()) as s:
cookies = s.cookie_jar.filter_cookies('http://example.com')
for key, cookie in cookies.items():
if key == 'test':
cookie_value = cookie.value
# Using the cookie value to do anything you want:
# e.g. sending a weird request containing the cookie in the header instead.
headers = {"Authorization": "Basic f'{cookie_value}'"}
async with s.get(url, headers=headers) as r:
print(await r.json())
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
For testing urls containing a host part made up by an IP address use aiohttp.ClientSession(cookie_jar=aiohttp.CookieJar(unsafe=True)), according to https://github.com/aio-libs/aiohttp/issues/1183#issuecomment-247788489
Yes, the cookies are stored in request.cookies as a dict, just like in flask, so request.cookies.get('name_of_cookie') works the same.
In the examples section of the aiohttp repository there is a file, web_cookies.py that shows how to retrieve, set, and delete a cookie. Here's the section from that script that reads the cookies and returns it to the template as a preformatted string:
from pprint import pformat
from aiohttp import web
tmpl = '''\
<html>
<body>
Login<br/>
Logout<br/>
<pre>{}</pre>
</body>
</html>'''
async def root(request):
resp = web.Response(content_type='text/html')
resp.text = tmpl.format(pformat(request.cookies))
return resp
You can get the cookie value, domain, path etc, without having to loop thru all cookies.
s.cookie_jar._cookies
gives you all the cookies in a defaultdict with the domains as keys and their respective cookies as values. aiohttp uses SimpleCookie
So, to get the value of a cookie
s.cookie_jar._cookies.get("https://httpbin.org")["cookie_name"].value
for domain, path:
s.cookie_jar._cookies.get("https://httpbin.org")["cookie_name"]["domain"]
s.cookie_jar._cookies.get("https://httpbin.org")["cookie_name"]["path"]
more info can be found here: https://docs.python.org/3/library/http.cookies.html
Related
I'm trying to login to a website via python to print the info. So I don't have to keep logging into multiple accounts.
In the tutorial I followed, he just had a login and password, but this one has
Website Form Data
Does the _wp attributes change each login?
The code I use:
mffloginurl = ('https://myforexfunds.com/login-signup/')
mffsecureurl = ('https://myforexfunds.com/account-2')
payload = {
'log': '*****#gmail.com',
'pdw': '*****'
'''brandnestor_action':'login',
'_wpnonce': '9d1753c0b6',
'_wp_http_referer': '/login-signup/',
'_wpnonce': '9d1753c0b6',
'_wp_http_referer': '/login-signup/'''
}
r = requests.post(mffloginurl, data=payload)
print(r.text)
using the correct details of course, but it doesn't login.
I tried without the extra wordpress elements and also with them but it still just goes to the signin page.
python output
different site addresses, different login details
Yeah the nonce will change with every new visit to the page.
I would use request.session() so that it automatically stores session cookies and all that good stuff.
Do a session.GET('some_login_page.com')
Parse with the response content with BeautifulSoup to retrieve the nonce.
Then add that into the payload of your POST request when you login.
A very quick and dirty example:
import requests
from bs4 import BeautifulSoup as bs
email = 'test#email.com'
password = 'password1234'
url = 'https://myforexfunds.com/account-2/'
# Start a session
with requests.session() as session:
# Send a GET request to the login page
r = session.get(url)
# Check if the request was successful
if r.status_code != 200:
print("Get Request Failed")
# Parse the HTML content of the page
soup = bs(r.content, 'lxml')
# Extract the value of the nonce from the HTML
nonce = soup.find(id='woocommerce-login-nonce')['value']
# Set up the login form data
params ={
"username": email,
"password": password,
"woocommerce-login-nonce": nonce,
"_wp_http_referer": "/account-2/",
"login": "Log+in"
}
# Send a POST request with the login form data
r = session.post(url, params=params)
# Check if the request was successful
if r.status_code != 200:
print("Login Failed")
I'm having issues w/ authenticating to pkgs.org api, a token was produced, by support mentioned it needs to be passed as a cookie. I've never worked with cookies before.
import requests
import json
import base64
import urllib3
import sys
import re
import os
token=('super-secret')
#s = requests.Session()
head = {'Accept':'application/json'}
r = requests.get('https://api.pkgs.org/v1/distributions', auth=(token), headers=head)
print(r)
print(r.connection)
print(r.cookies)
I tried to use the request.session method, to handle the cookie, but i honestly don't know syntax on how to ever 1 create a cookie, let alone pass the cookie.
If I read the API documentation correctly you should set acces_token cookie:
import requests
token = "super-secret"
cookies = {"access_token": token}
headers = {"Accept": "application/json"}
r = requests.get(
"https://api.pkgs.org/v1/distributions", cookies=cookies, headers=headers
)
How do we post a GraphQL request through AWS AppSync using boto?
Ultimately I'm trying to mimic a mobile app accessing our stackless/cloudformation stack on AWS, but with python. Not javascript or amplify.
The primary pain point is authentication; I've tried a dozen different ways already. This the current one, which generates a "401" response with "UnauthorizedException" and "Permission denied", which is actually pretty good considering some of the other messages I've had. I'm now using the 'aws_requests_auth' library to do the signing part. I assume it authenticates me using the stored /.aws/credentials from my local environment, or does it?
I'm a little confused as to where and how cognito identities and pools will come into it. eg: say I wanted to mimic the sign-up sequence?
Anyways the code looks pretty straightforward; I just don't grok the authentication.
from aws_requests_auth.boto_utils import BotoAWSRequestsAuth
APPSYNC_API_KEY = 'inAppsyncSettings'
APPSYNC_API_ENDPOINT_URL = 'https://aaaaaaaaaaaavzbke.appsync-api.ap-southeast-2.amazonaws.com/graphql'
headers = {
'Content-Type': "application/graphql",
'x-api-key': APPSYNC_API_KEY,
'cache-control': "no-cache",
}
query = """{
GetUserSettingsByEmail(email: "john#washere"){
items {name, identity_id, invite_code}
}
}"""
def test_stuff():
# Use the library to generate auth headers.
auth = BotoAWSRequestsAuth(
aws_host='aaaaaaaaaaaavzbke.appsync-api.ap-southeast-2.amazonaws.com',
aws_region='ap-southeast-2',
aws_service='appsync')
# Create an http graphql request.
response = requests.post(
APPSYNC_API_ENDPOINT_URL,
json={'query': query},
auth=auth,
headers=headers)
print(response)
# this didn't work:
# response = requests.post(APPSYNC_API_ENDPOINT_URL, data=json.dumps({'query': query}), auth=auth, headers=headers)
Yields
{
"errors" : [ {
"errorType" : "UnauthorizedException",
"message" : "Permission denied"
} ]
}
It's quite simple--once you know. There are some things I didn't appreciate:
I've assumed IAM authentication (OpenID appended way below)
There are a number of ways for appsync to handle authentication. We're using IAM so that's what I need to deal with, yours might be different.
Boto doesn't come into it.
We want to issue a request like any regular punter, they don't use boto, and neither do we. Trawling the AWS boto docs was a waste of time.
Use the AWS4Auth library
We are going to send a regular http request to aws, so whilst we can use python requests they need to be authenticated--by attaching headers.
And, of course, AWS auth headers are special and different from all others.
You can try to work out how to do it
yourself, or you can go looking for someone else who has already done it: Aws_requests_auth, the one I started with, probably works just fine, but I have ended up with AWS4Auth. There are many others of dubious value; none endorsed or provided by Amazon (that I could find).
Specify appsync as the "service"
What service are we calling? I didn't find any examples of anyone doing this anywhere. All the examples are trivial S3 or EC2 or even EB which left uncertainty. Should we be talking to api-gateway service? Whatsmore, you feed this detail into the AWS4Auth routine, or authentication data. Obviously, in hindsight, the request is hitting Appsync, so it will be authenticated by Appsync, so specify "appsync" as the service when putting together the auth headers.
It comes together as:
import requests
from requests_aws4auth import AWS4Auth
# Use AWS4Auth to sign a requests session
session = requests.Session()
session.auth = AWS4Auth(
# An AWS 'ACCESS KEY' associated with an IAM user.
'AKxxxxxxxxxxxxxxx2A',
# The 'secret' that goes with the above access key.
'kwWxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxgEm',
# The region you want to access.
'ap-southeast-2',
# The service you want to access.
'appsync'
)
# As found in AWS Appsync under Settings for your endpoint.
APPSYNC_API_ENDPOINT_URL = 'https://nqxxxxxxxxxxxxxxxxxxxke'
'.appsync-api.ap-southeast-2.amazonaws.com/graphql'
# Use JSON format string for the query. It does not need reformatting.
query = """
query foo {
GetUserSettings (
identity_id: "ap-southeast-2:8xxxxxxb-7xx4-4xx4-8xx0-exxxxxxx2"
){
user_name, email, whatever
}}"""
# Now we can simply post the request...
response = session.request(
url=APPSYNC_API_ENDPOINT_URL,
method='POST',
json={'query': query}
)
print(response.text)
Which yields
# Your answer comes as a JSON formatted string in the text attribute, under data.
{"data":{"GetUserSettings":{"user_name":"0xxxxxxx3-9102-42f0-9874-1xxxxx7dxxx5"}}}
Getting credentials
To get rid of the hardcoded key/secret you can consume the local AWS ~/.aws/config and ~/.aws/credentials, and it is done this way...
# Use AWS4Auth to sign a requests session
session = requests.Session()
credentials = boto3.session.Session().get_credentials()
session.auth = AWS4Auth(
credentials.access_key,
credentials.secret_key,
boto3.session.Session().region_name,
'appsync',
session_token=credentials.token
)
...<as above>
This does seem to respect the environment variable AWS_PROFILE for assuming different roles.
Note that STS.get_session_token is not the way to do it, as it may try to assume a role from a role, depending where it keyword matched the AWS_PROFILE value. Labels in the credentials file will work because the keys are right there, but names found in the config file do not work, as that assumes a role already.
OpenID
In this scenario, all the complexity is transferred to the conversation with the openid connect provider. The hard stuff is all the auth hoops you jump through to get an access token, and thence using the refresh token to keep it alive. That is where all the real work lies.
Once you finally have an access token, assuming you have configured the "OpenID Connect" Authorization Mode in appsync, then you can, very simply, drop the access token into the header:
response = requests.post(
url="https://nc3xxxxxxxxxx123456zwjka.appsync-api.ap-southeast-2.amazonaws.com/graphql",
headers={"Authorization": ACCESS_TOKEN},
json={'query': "query foo{GetStuff{cat, dog, tree}}"}
)
You can set up an API key on the AppSync end and use the code below. This works for my case.
import requests
# establish a session with requests session
session = requests.Session()
# As found in AWS Appsync under Settings for your endpoint.
APPSYNC_API_ENDPOINT_URL = 'https://vxxxxxxxxxxxxxxxxxxy.appsync-api.ap-southeast-2.amazonaws.com/graphql'
# setup the query string (optional)
query = """query listItemsQuery {listItemsQuery {items {correlation_id, id, etc}}}"""
# Now we can simply post the request...
response = session.request(
url=APPSYNC_API_ENDPOINT_URL,
method='POST',
headers={'x-api-key': '<APIKEYFOUNDINAPPSYNCSETTINGS>'},
json={'query': query}
)
print(response.json()['data'])
Building off Joseph Warda's answer you can use the class below to send AppSync commands.
# fileName: AppSyncLibrary
import requests
class AppSync():
def __init__(self,data):
endpoint = data["endpoint"]
self.APPSYNC_API_ENDPOINT_URL = endpoint
self.api_key = data["api_key"]
self.session = requests.Session()
def graphql_operation(self,query,input_params):
response = self.session.request(
url=self.APPSYNC_API_ENDPOINT_URL,
method='POST',
headers={'x-api-key': self.api_key},
json={'query': query,'variables':{"input":input_params}}
)
return response.json()
For example in another file within the same directory:
from AppSyncLibrary import AppSync
APPSYNC_API_ENDPOINT_URL = {YOUR_APPSYNC_API_ENDPOINT}
APPSYNC_API_KEY = {YOUR_API_KEY}
init_params = {"endpoint":APPSYNC_API_ENDPOINT_URL,"api_key":APPSYNC_API_KEY}
app_sync = AppSync(init_params)
mutation = """mutation CreatePost($input: CreatePostInput!) {
createPost(input: $input) {
id
content
}
}
"""
input_params = {
"content":"My first post"
}
response = app_sync.graphql_operation(mutation,input_params)
print(response)
Note: This requires you to activate API access for your AppSync API. Check this AWS post for more details.
graphql-python/gql supports AWS AppSync since version 3.0.0rc0.
It supports queries, mutation and even subscriptions on the realtime endpoint.
The documentation is available here
Here is an example of a mutation using the API Key authentication:
import asyncio
import os
import sys
from urllib.parse import urlparse
from gql import Client, gql
from gql.transport.aiohttp import AIOHTTPTransport
from gql.transport.appsync_auth import AppSyncApiKeyAuthentication
# Uncomment the following lines to enable debug output
# import logging
# logging.basicConfig(level=logging.DEBUG)
async def main():
# Should look like:
# https://XXXXXXXXXXXXXXXXXXXXXXXXXX.appsync-api.REGION.amazonaws.com/graphql
url = os.environ.get("AWS_GRAPHQL_API_ENDPOINT")
api_key = os.environ.get("AWS_GRAPHQL_API_KEY")
if url is None or api_key is None:
print("Missing environment variables")
sys.exit()
# Extract host from url
host = str(urlparse(url).netloc)
auth = AppSyncApiKeyAuthentication(host=host, api_key=api_key)
transport = AIOHTTPTransport(url=url, auth=auth)
async with Client(
transport=transport, fetch_schema_from_transport=False,
) as session:
query = gql(
"""
mutation createMessage($message: String!) {
createMessage(input: {message: $message}) {
id
message
createdAt
}
}"""
)
variable_values = {"message": "Hello world!"}
result = await session.execute(query, variable_values=variable_values)
print(result)
asyncio.run(main())
I am unable to add a comment due to low rep, but I just want to add that I tried the accepted answer and it didn't work. I was getting an error saying my session_token is invalid. Probably because I was using AWS Lambda.
I got it to work pretty much exactly, but by adding to the session token parameter of the aws4auth object. Here's the full piece:
import requests
import os
from requests_aws4auth import AWS4Auth
def AppsyncHandler(event, context):
# These are env vars that are always present in an AWS Lambda function
# If not using AWS Lambda, you'll need to add them manually to your env.
access_id = os.environ.get("AWS_ACCESS_KEY_ID")
secret_key = os.environ.get("AWS_SECRET_ACCESS_KEY")
session_token = os.environ.get("AWS_SESSION_TOKEN")
region = os.environ.get("AWS_REGION")
# Your AppSync Endpoint
api_endpoint = os.environ.get("AppsyncConnectionString")
resource = "appsync"
session = requests.Session()
session.auth = AWS4Auth(access_id,
secret_key,
region,
resource,
session_token=session_token)
The rest is the same.
Hope this Helps Everyone
import requests
import json
import os
from dotenv import load_dotenv
load_dotenv(".env")
class AppSync(object):
def __init__(self,data):
endpoint = data["endpoint"]
self.APPSYNC_API_ENDPOINT_URL = endpoint
self.api_key = data["api_key"]
self.session = requests.Session()
def graphql_operation(self,query,input_params):
response = self.session.request(
url=self.APPSYNC_API_ENDPOINT_URL,
method='POST',
headers={'x-api-key': self.api_key},
json={'query': query,'variables':{"input":input_params}}
)
return response.json()
def main():
APPSYNC_API_ENDPOINT_URL = os.getenv("APPSYNC_API_ENDPOINT_URL")
APPSYNC_API_KEY = os.getenv("APPSYNC_API_KEY")
init_params = {"endpoint":APPSYNC_API_ENDPOINT_URL,"api_key":APPSYNC_API_KEY}
app_sync = AppSync(init_params)
mutation = """
query MyQuery {
getAccountId(id: "5ca4bbc7a2dd94ee58162393") {
_id
account_id
limit
products
}
}
"""
input_params = {}
response = app_sync.graphql_operation(mutation,input_params)
print(json.dumps(response , indent=3))
main()
I'm logged in on some page in Firefox and I want to take the cookie and try to browse webpage with python-requests. Problem is that after importing cookie to the requests session nothing happen (like there is no cookie at all). Structure of the cookie made by requests differ from the one from Firefox as well.
Is such it possible to load FF cookie and use it in requests session?
My code so far:
import sys
import sqlite3
import http.cookiejar as cookielib
import requests
from requests.utils import dict_from_cookiejar
def get_cookies(final_cookie, firefox_cookies):
con = sqlite3.connect(firefox_cookies)
cur = con.cursor()
cur.execute("SELECT host, path, isSecure, expiry, name, value FROM moz_cookies")
for item in cur.fetchall():
if item[0].find("mydomain.com") == -1:
continue
c = cookielib.Cookie(0, item[4], item[5],
None, False,
item[0], item[0].startswith('.'), item[0].startswith('.'),
item[1], False,
item[2],
item[3], item[3]=="",
None, None, {})
final_cookie.set_cookie(c)
cookie = cookielib.CookieJar()
input_file = ~/.mozilla/firefox/myprofile.default/cookies.sqlite
get_cookies(cookie, input_file)
#print cookie given from firefox
cookies = dict_from_cookiejar(cookie)
for key, value in cookies.items():
print(key, value)
s = requests.Session()
payload = {
"lang" : "en",
'destination': '/auth',
'credential_0': sys.argv[1],
'credential_1': sys.argv[2],
'credential_2': '86400',
}
r = s.get("mydomain.com/login", data = payload)
#print cookie from requests
cookies = dict_from_cookiejar(s.cookies)
for key, value in cookies.items():
print(key, value)
Structure of cookies from firefox is:
_gid GA1.3.2145214.241324
_ga GA1.3.125598754.422212
_gat_is4u 1
Structure of cookies from requests is:
UISTestAuth tesskMpA8JJ23V43a%2FoFtdesrtsszpw
After all, when trying to assign cookies from FF to session.cookies, requests works as I import nothing.
It looks like there are two type of cookies in the Firefox - request and response. It could be seen while Page inspector > Network > login (post) > Cookies:
Response cookies:
UISAuth
httpOnly true
path /
secure true
value tesskMpA8JJ23V43a%2FoFtdesrtsszpw
Request cookies:
_ga GA1.3.125598754.422212
_gat_is4u 1
_gid GA1.3.2145214.241324
The request cookies are stored in the cookies.sqlite file in the
~/.mozilla/firefox/*.default/cookies.sqlite
and can be load to the python object in more ways, for example:
import sqlite3
import http.cookiejar
def get_cookies(cj, ff_cookies):
con = sqlite3.connect(ff_cookies)
cur = con.cursor()
cur.execute("SELECT host, path, isSecure, expiry, name, value FROM moz_cookies")
for item in cur.fetchall():
c = cookielib.Cookie(0, item[4], item[5],
None, False,
item[0], item[0].startswith('.'), item[0].startswith('.'),
item[1], False,
item[2],
item[3], item[3]=="",
None, None, {})
print c
cj.set_cookie(c)
where cj is CookieJar object and ff_cookies is path to Firefox cookies.sqlite. Taken from this page.
The whole code to load cookies and import to the python requests using session would looks like:
import requests
import sys
cj = http.cookiejar.CookieJar()
ff_cookies = sys.argv[1] #pass path to the cookies.sqlite as an argument to the script
get_cookies(cj, ff_cookies)
s = requests.Session()
s.cookies = cj
Response cookie is basically session ID, which usualy expires at the end of the session (or some timeout), so they are not stored.
There is a package on PyPi, browser-cookie3, which does exactly this.
import browser_cookie3
import requests
cookiejar = browser_cookie3.firefox(domain_name='signed-in-website.tld')
resp = requests.get('https://signed-in-website.tld/path/', cookies=cookiejar)
print(resp.content)
browser_cookie3.firefox() retrieves Firefox cookies as a cookiejar with a domain name as optional argument.
I've got a list of ~3,000 URLs I'm trying to create Google shortened links of, the idea is this CSV has a list of links and I want my code to output the shortened links in the column next to the original URLs.
I've been trying to modify the code found on this site here but I'm not skilled enough to get it to work.
Here's my code (I would not normally post an API key but the original person who asked this already posted it publicly on this site) :
import json
import pandas as pd
df = pd.read_csv('Links_Test.csv')
def shorternUrl(my_URL):
API_KEY = "AIzaSyCvhcU63u5OTnUsdYaCFtDkcutNm6lIEpw"
apiUrl = 'https://www.googleapis.com/urlshortener/v1/url'
longUrl = my_URL
headers = {"Content-type": "application/json"}
data = {"longUrl": longUrl}
h = httplib2.Http('.cache')
headers, response = h.request(apiUrl, "POST", json.dumps(data), headers)
return response
for url in df['URL']:
x = shorternUrl(url)
# Then I want it to write x into the column next to the original URL
But I only get errors at this point, before I even started figuring out how to write the new URLs to the CSV file.
Here's some sample data:
URL
www.apple.com
www.google.com
www.microsoft.com
www.linux.org
Thank you for your help,
Me
I think the issue is that you didnot include the API key in the request. By the way, the certifi package allows you to secure a connection to a link. You can get it using pip install certifi or pip urllib3[secure].
Here I create my own API key, so you might want to replace it with yours.
from urllib3 import PoolManager
import json
import certifi
sampleURL = 'http://www.apple.com'
APIkey = 'AIzaSyD8F41CL3nJBpEf0avqdQELKO2n962VXpA'
APIurl = 'https://www.googleapis.com/urlshortener/v1/url?key=' + APIkey
http = PoolManager(cert_reqs = 'CERT_REQUIRED', ca_certs=certifi.where())
def shortenURL(url):
data = {'key': APIkey, 'longUrl' : url}
response = http.request("POST", APIurl, body=json.dumps(data), headers= {'Content-Type' : 'application/json'}).data.decode('utf-8')
r = json.loads(response)
return (r['id'])
The decoding part converts the response object into a string so that we can convert it to a JSON and retrieve data.
From there on, you can store the data into another column and so on.
For the sampleUrl, I got back https(goo.gl/nujb) from the function.
I found a solution here:
https://pypi.python.org/pypi/pyshorteners
Example copied from linked page:
from pyshorteners import Shortener
url = 'http://www.google.com'
api_key = 'YOUR_API_KEY'
shortener = Shortener('Google', api_key=api_key)
print "My short url is {}".format(shortener.short(url))