I am trying to sign in to a website using RoboBrowser and I am stuck on an error message.
My code:
from robobrowser import RoboBrowser

browser = RoboBrowser()

def login():
    browser.open('https://www.kijiji.ca/t-login.html')
    form = browser.get_form(id="login-form")
    form.fields["LoginEmailOrNickname"].value = "an_email_address"
    form.fields["login-password"].value = "a_password"
    form.fields["login-rememberMe"].value = "true"
    browser.submit_form(form)

login()
The error message:
Traceback (most recent call last):
File "/home/rojaslee/Desktop/kijiji_poster/kijiji_poster.py", line 16, in <module>
login()
File "/home/rojaslee/Desktop/kijiji_poster/kijiji_poster.py", line 11, in login
form.fields["LoginEmailOrNickname"].value = ["an_email_address"]
File "/usr/local/lib/python3.4/dist-packages/werkzeug/datastructures.py", line 744, in __getitem__
raise exceptions.BadRequestKeyError(key)
werkzeug.exceptions.BadRequestKeyError: 400: Bad Request
Old thread, but I've been struggling with the same problem. This worked for me:
from robobrowser import RoboBrowser

browser = RoboBrowser()

def login():
    browser.open('https://www.kijiji.ca/t-login.html')
    form = browser.get_form(id="login-form")
    form["emailOrNickname"].value = "an_email_address"
    form["password"].value = "a_password"
    form["rememberMe"].value = "checked"
    browser.submit_form(form)
The HTML from the site you want to log in to is as follows:
<section>
<label for="LoginEmailOrNickname">Email Address or Nickname:</label>
<input id="LoginEmailOrNickname" name="emailOrNickname" req="req" type="text" value="" maxlength="128"><span class="field-message" data-for="LoginEmailOrNickname"></span>
</section>
<section>
<label for="login-password">Password:</label>
<input id="login-password" name="password" req="req" type="password" value="" maxlength="64"><span class="field-message" data-for="login-password"></span>
<a id="LoginForgottenPassword" href="/t-forgot-password.html">Forgot your password?</a>
</section>
To set a value on the form fields you have to use the name attribute, not the id.
This code should work:
form.fields["emailOrNickname"].value = "an_email_address"
form.fields["password"].value = "a_password"
form.fields["rememberMe"].value = "true"
If you need to see the form's fields, you can print them:
print(form.fields)
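Putting it together, a minimal sketch of the corrected login, using the name attributes from the HTML above (the rememberMe field is not shown in that snippet, so it is omitted here):

from robobrowser import RoboBrowser

browser = RoboBrowser(parser="html.parser")  # explicit parser avoids a bs4 warning

def login():
    browser.open('https://www.kijiji.ca/t-login.html')
    form = browser.get_form(id="login-form")
    # set fields by their name attributes, not their ids
    form.fields["emailOrNickname"].value = "an_email_address"
    form.fields["password"].value = "a_password"
    browser.submit_form(form)

login()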
But have you tried using a try/except block?
I was able to get around this error using:
try:
    form['userid'] = 'my_username'
    form['password'] = 'my_password'
except Exception:
    pass
I used a different website than you, so make sure you input the correct form fields.
Let me know if this helps!
I am learning Selenium using Python. I tried to check whether a button is enabled or not. For that I used the Mercury Tours demo site and wrote the code below. I am getting:
Unable to locate element: input[value=roundtrip]
The code I used:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
driver = webdriver.Firefox(executable_path="C:\\Seleinum_Python\\WedDriver\\geckodriver.exe")
driver.get("http://newtours.demoaut.com/")
# Whether the user name & password enabled and displayed
user_ele = driver.find_element_by_name("userName") # <input type="text" name="userName" size="10">
print(user_ele.is_displayed()) # Return True /False based on element status
print(user_ele.is_enabled())
pwd_ele = driver.find_element_by_name("password") #<input type="password" name="password" size="10">
print(pwd_ele.is_displayed())
print(pwd_ele.is_enabled())
user_ele.send_keys("mercury")
pwd_ele.send_keys("mercury")
driver.find_element_by_name("login").click()
roundtrip_radio= driver.find_element_by_css_selector("input[value=roundtrip]") #<input type="radio" name="tripType" value="roundtrip" checked="">
print(roundtrip_radio.is_selected())
oneway_radio= driver.find_element_by_css_selector("input[value=oneway]")
print(oneway_radio.is_selected())
driver.close()
I even tried the combinations below, but I am still getting the same element-not-found issue.
roundtrip_radio= driver.find_element_by_css_selector("input[value='roundtrip']")
roundtrip_radio= driver.find_element_by_css_selector('input[value="roundtrip"]')
roundtrip_radio= driver.find_element_by_css_selector("//input[value='roundtrip']")
roundtrip_radio= driver.find_element_by_css_selector("input[value='roundtrip']")
roundtrip_radio= driver.find_element_by_css_selector('input[name="tripType"][value="roundtrip"]')
In your CSS selector you can use :checked to identify selected elements.
For example:
"input[type=radio]:checked"
Running my Selenium script, I got the error:
selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable
What am I doing wrong?
CODE:
oCheckBox = browser.find_element_by_css_selector("input[value='para-mim']")
oCheckBox.click()
HTML
<input type="radio" name="para-quem" id="para-mim" value="para-mim">
Try this:
browser.find_element_by_xpath("//input[@value='para-mim']").click()
Try using .execute_script:
oCheckBox = browser.find_element_by_css_selector("input[value='para-mim']")
browser.execute_script("arguments[0].click();", oCheckBox)
Or use ActionChains:
from selenium.webdriver import ActionChains
oCheckBox = browser.find_element_by_css_selector("input[value='para-mim']")
action = ActionChains(browser)
action.move_to_element(oCheckBox).click(oCheckBox).perform()
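If the radio itself is hidden behind a styled control (a common cause of this exception), clicking its label can also work. A hedged sketch, assuming the page has a label whose for attribute matches the input's id shown above:

# hypothetical: only works if a <label for="para-mim"> exists on the page
label = browser.find_element_by_css_selector("label[for='para-mim']")
label.click()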
I am doing a self-directed project to keep learning and practicing Python 3. I have done other scraping projects using BS4 and Selenium, but in this project I would like to do it with BS4 alone.
I want to scrape some data from this site. The first problem I am facing is that I need to be logged in to get the data. For this test I am using a user and password provided by the website, so you can use the same credentials. You must also select a "race" from the form (I chose Manilla - Calbayog).
With the inspector I identified the info I need to pass in the POST request:
<input name="boat" type="text" />
<input name="key" type="password" />
<select name="race">
<option value="1159">Manilla - Calbayog</option> 'This is the one I want to check for the test
And this is my code:
from bs4 import BeautifulSoup
import requests
login_data = {'boat':'sol','key':'sol','race':'1159'}
s = requests.session()
post = s.post('http://sol.brainaid.de/sailonline/toolbox/', login_data)
r = requests.get('http://sol.brainaid.de/sailonline/toolbox/')
page = r.content
soup = BeautifulSoup(page, 'html.parser')
print(soup.prettify())
When I check the printed output I can see that I am still at the login page.
Assuming I could log in correctly, the second problem follows. When you are logged in, a new menu of buttons appears; the data I need to scrape is under "Navigation". The thing is that when you press the button the new info appears in the browser, but the URL does not change; no matter where you click, the URL is always the same. So how do I get there?
And the final problem: assuming I am in the "Navigation" section (without a distinct URL), I need to refresh that info at least every 30 seconds. How can I do that if there is no URL to request?
Is there any way to do this without using Selenium?
This page loads data dynamically through Ajax; the URL serving the boat's XML data is http://sol.brainaid.de/sailonline/toolbox/getBoatData.php (you can check it in the Firefox/Chrome network inspector). All you need is the token, which is stored in cookies upon login:
from bs4 import BeautifulSoup
import requests

login_data = {'boat': 'sol', 'key': 'sol', 'race': '1159'}
login_url = 'http://sol.brainaid.de/sailonline/toolbox/login.php'
boat_data_url = 'http://sol.brainaid.de/sailonline/toolbox/getBoatData.php'

with requests.session() as s:
    post = s.post(login_url, login_data)
    data = {'boat': 'sol', 'race': '1159', 'token': s.cookies.get_dict()['sailonline[1159][sol]']}
    boat_data = BeautifulSoup(s.post(boat_data_url, data=data).text, 'xml')
    print(boat_data.prettify())
This will print:
<?xml version="1.0" encoding="utf-8"?>
<BOAT>
<LAT>
N 14°35.4000'
</LAT>
<LON>
E 120°57.0000'
</LON>
<DTG>
381.84
</DTG>
<DBL>
107.68
</DBL>
<TWD>
220.48
</TWD>
<TWS>
4.76
</TWS>
<WPT>
0
</WPT>
<RANK>
-
</RANK>
<lCOG>
COG
</lCOG>
<lTWA>
<u>TWA</u>
</lTWA>
<COG>
220.48
</COG>
<TWA>
000.00
</TWA>
<SOG>
0.00
</SOG>
<PERF>
100.00
</PERF>
<VMG>
0.00
</VMG>
<DATE>
2018-07-25
</DATE>
<TIME>
12:47:11
</TIME>
</BOAT>
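As for the question's last point, refreshing every 30 seconds: since the data comes from a plain POST endpoint rather than a page, you can simply repeat the request in a loop (inside the with block above, so the session and token stay valid). A minimal sketch:

import time

while True:
    boat_data = BeautifulSoup(s.post(boat_data_url, data=data).text, 'xml')
    print(boat_data.TIME.text.strip(), boat_data.DTG.text.strip())  # pick whichever tags you need
    time.sleep(30)  # poll every 30 seconds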
I'm attempting to write a Twisted web server in Python 3.6 that can upload multiple files, but being fairly new to both Python and web programming, I have run into an issue I don't understand, and I also cannot find any good examples of multiple file upload.
With the following code I get the error logged below.
from twisted.web import server, resource
from twisted.internet import reactor, endpoints
import cgi

class Counter(resource.Resource):
    isLeaf = True
    numberRequests = 0

    def render_GET(self, request):
        print("GET " + str(self.numberRequests))
        self.numberRequests += 1
        # request.setHeader(b"content-type", b"text/plain")
        # content = u"I am request #{}\n".format(self.numberRequests)
        content = """<html>
        <body>
        <form enctype="multipart/form-data" method="POST">
        Text: <input name="text1" type="text" /><br />
        File: <input name="file1" type="file" multiple /><br />
        <input type="submit" />
        </form>
        </body>
        </html>"""
        print(request.uri)
        return content.encode("ascii")

    def render_POST(self, request):
        postheaders = request.getAllHeaders()
        try:
            postfile = cgi.FieldStorage(
                fp=request.content,
                headers=postheaders,
                environ={'REQUEST_METHOD': 'POST',
                         # 'CONTENT_TYPE': postheaders['content-type'],  # gives builtins.KeyError: 'content-type'
                         }
            )
        except Exception as e:
            print('something went wrong: ' + str(e))
        filename = postfile["file"].filename  # the "file1" key also does not work
        print(filename)
        file = request.args["file"][0]  # the "file1" key also does not work

endpoints.serverFromString(reactor, "tcp:1234").listen(server.Site(Counter()))
reactor.run()
Error log
C:\Users\theuser\AppData\Local\conda\conda\envs\py36\python.exe C:/Users/theuser/PycharmProjects/myproject/twweb.py
GET 0
b'/'
# My comment POST starts here
Unhandled Error
Traceback (most recent call last):
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\http.py", line 1614, in dataReceived
finishCallback(data[contentLength:])
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\http.py", line 2029, in _finishRequestBody
self.allContentReceived()
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\http.py", line 2104, in allContentReceived
req.requestReceived(command, path, version)
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\http.py", line 866, in requestReceived
self.process()
--- <exception caught here> ---
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\server.py", line 195, in process
self.render(resrc)
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\server.py", line 255, in render
body = resrc.render(self)
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\site-packages\twisted\web\resource.py", line 250, in render
return m(request)
File "C:/Users/theuser/PycharmProjects/myproject/twweb.py", line 42, in render_POST
filename = postfile["file"].filename #file1 tag also does not work
File "C:\Users\theuser\AppData\Local\conda\conda\envs\py36\lib\cgi.py", line 604, in __getitem__
raise KeyError(key)
builtins.KeyError: 'file'
I don't understand how to get hold of the uploaded file or files in render_POST so I can save them after the form submit. This SO post appears not to have this problem. The final app should be able to do this asynchronously for multiple users, but before that I would be happy to get just a simple app working.
Using conda on Windows 10 with Python 3.6.
FieldStorage in Python 3+ returns a dict with keys as bytes, not strings. So you must access keys like this:
postfile[b"file"]
Notice that the key is prefixed with b"". It's a tad confusing if you're new to Python and unaware of the string changes between Python 2 and 3.
I also answered a similar question a while back but was unable to get it working properly on Python 3.4, though I can't remember what exactly didn't work. Hopefully you don't run into any issues with 3.6.
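If the key still is not obvious, printing the parsed field names avoids guessing. A small debugging sketch reusing the question's variables (run inside render_POST; whether parsing succeeds at all depends on the headers passed in, which is an open question here):

postfile = cgi.FieldStorage(
    fp=request.content,
    headers=postheaders,
    environ={'REQUEST_METHOD': 'POST'})
print(postfile.keys())  # shows whether the keys are b'file1', 'file1', etc.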
I am trying to access a website to scrape some information, however I am having trouble posting login information through Python. Here is my code so far:
import requests
c = requests.Session()
url = 'https://subscriber.hoovers.com/H/login/login.html'
USERNAME = 'user'
PASSWORD = 'pass'
c.get(url)
csrftoken = c.cookies['csrftoken']
login_data = dict(j_username=USERNAME, j_password=PASSWORD,
csrfmiddlewaretoken=csrftoken, next='/')
c.post(url, data=login_data, headers=dict(Referer=url))
page = c.get('http://subscriber.hoovers.com/H/home/index.html')
print(page.content)
Here is the form data from the login POST:
j_username:user
j_password:pass
OWASP_CSRFTOKEN:8N0Z-TND5-NV71-C4N4-43BK-B13S-A1MO-NZQC
OWASP_CSRFTOKEN:8N0Z-TND5-NV71-C4N4-43BK-B13S-A1MO-NZQC
Here is the error I receive:
Traceback (most recent call last):
File "C:/Users/10023539/Desktop/pyscripts/webscraper ex.py", line 9, in <module>
csrftoken = c.cookies['csrftoken']
File "C:\Program Files (x86)\Python35-32\Lib\site-packages\requests\cookies.py", line 293, in __getitem__
return self._find_no_duplicates(name)
File "C:\Program Files (x86)\Python35-32\Lib\site-packages\requests\cookies.py", line 351, in _find_no_duplicates
raise KeyError('name=%r, domain=%r, path=%r' % (name, domain, path))
KeyError: "name='csrftoken', domain=None, path=None"
I believe the issue has something to do with the 'OWASP_CSRFTOKEN' label. I haven't found any solutions for this specific CSRF name anywhere online. I've also tried removing the c.cookies lookup and manually typing the CSRF code into the csrfmiddlewaretoken argument, and I've tried changing the Referer URL around; I still get the same error.
Any assistance would be greatly appreciated.
First of all, you are getting a KeyError exception, which means the cookies dictionary has no csrftoken key.
So you need to explore your response to find the right CSRF token cookie name.
For example, you can print all the cookies:
for key in c.cookies.keys():
    print('%s: %s' % (key, c.cookies[key]))
UPD: Actually, your response has no CSRF cookie.
You need to look for the token in c.text, e.g. with pyquery:
<input type="hidden" name="OWASP_CSRFTOKEN" class="csrfClass" value="X48L-NEYI-CG18-SJOD-VDW9-FGEB-7WIT-88P4">