Solving recaptcha with anticaptcha using Python - python-3.x

I am trying to solve a reCAPTCHA using the anticaptcha API,
but I am unable to figure out how to submit the response.
Here is what I am trying to do:
driver.switch_to.frame(driver.find_element_by_xpath('//iframe'))
site_key = '6Ldd2doaAAAAAFhvJxqgQ0OKnYEld82b9FKDBnRE'
api_key = 'api_keys'
url = 'https://coinsniper.net/register'
client = AnticaptchaClient(api_key)
task = NoCaptchaTaskProxylessTask(url, site_key)
job = client.createTask(task)
job.join()
driver.execute_script("document.getElementById('g-recaptcha-response').innerHTML='{}';".format(job.get_solution_response()))
driver.refresh()
The above snippet only refreshes the same page and does not redirect to the input URL.
Then I saw that there is a variable in a script on the same page, so I tried executing that as well to submit the form:
driver.execute_script("var captchaSubmitEl = document.getElementById('captcha-submit');")
driver.refresh()
Which also fails. The webpage is here.
My second try used this URL, which loads the reCAPTCHA of the same page.
This time I tried a different site_key and url, extracted as below:
url_key = driver.find_element_by_xpath('//*[@id="captcha-submit"]/div/div/iframe').get_attribute('src')
site_key = re.search('k=([^&]+)',url_key).group(1)
url = 'https://geo.captcha-delivery.com/captcha/?initialCid=AHrlqAAAAAMABhLJ2Rn0V78AZ5gFAg%3D%3D&hash=7F23E8F8FB0B33347C06D1347938C1&cid=.z5o-mMJuvaX_CLxOMBRebJsY6NgZvUv87bLMft~A_st0Fkvl~3jcaTr1R64GU7xO.WZFYNq5P3.UNuLWFa32.Pe6GGuIV7Y5w-RaMu0K3&t=fe&referer=https%3A%2F%2Fcoinsniper.net%2Fregister&s=33682'
client = AnticaptchaClient(api_key)
task = NoCaptchaTaskProxylessTask(url, site_key)
job = client.createTask(task)
job.join()
driver.execute_script("document.getElementById('g-recaptcha-response').innerHTML='{}';".format(job.get_solution_response()))
driver.refresh()
Neither approach works, and I don't know why. I have been looking for a solution for the past 3 days and have not found one that works in my case.
Can anyone look into this and let me know what is wrong with this code?

After you receive a response from anti-captcha, you should set it as the value of this element:
<input type="hidden" class="mtcaptcha-verifiedtoken" name="mtcaptcha-verifiedtoken" id="mtcaptcha-verifiedtoken-1" readonly="readonly" value="">
Fill in all other fields on UI and click the Register button.
You should not refresh the page.
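A minimal sketch of that step, assuming the element id from the snippet above is the one present on the page. The helper name build_set_value_script is hypothetical. It sets the input's value (rather than innerHTML, as the question does) and uses json.dumps so that quotes in the token cannot break the script:

```python
import json


def build_set_value_script(element_id, token):
    """Return JavaScript that assigns `token` to the `value` of an input."""
    # json.dumps produces JavaScript-compatible string literals, so quotes
    # or backslashes in the token cannot break out of the script.
    return "document.getElementById({}).value = {};".format(
        json.dumps(element_id), json.dumps(token)
    )


# "example-token" is a placeholder for job.get_solution_response().
script = build_set_value_script("mtcaptcha-verifiedtoken-1", "example-token")
print(script)

# With a live Selenium session this would be run as:
# driver.execute_script(script)
```

After running the script, fill in the remaining fields and click Register without calling driver.refresh().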

Related

Why do I fail to submit data to textarea with python requests.post()

I want to use requests.post to automatically query domain name attribution on this website, but the return value is always empty. I guess it is because the POST request failed to transfer the data to the textarea.
url = 'http://icp.chinaz.com/searchs'
data = {
    'hosts':'qq.com',
    'btn_search':'%E6%9F%A5%E8%AF%A2',
    '__RequestVerificationToken=':'CfDJ8KmGV0zny1FLtrcRZkGCczG2owvCRigVmimqp33mw5t861kI7zU2VfQBri65hIwu_4VXbmtqImMmTeZzqxE7NwwvAtwWcZSMnk6UDuzP3Ymzh9YrHPJkj_cy8vSLXvUYXrFO0HnCb0pSweHIL9pkReY',
}
requests.post(url=url,data=data,headers=headers).content.decode('utf-8')
I'd be very grateful if you could point out where I'm going wrong.
I have tried replacing the headers and so on.
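Two details in the data dict may be worth checking, though neither is confirmed against this site: the key '__RequestVerificationToken=' ends with a stray '=', and the btn_search value is already percent-encoded. requests form-encodes dict values itself, so a pre-encoded value gets encoded a second time. The sketch below uses only the standard library's urlencode to illustrate the effect:

```python
from urllib.parse import unquote, urlencode

pre_encoded = "%E6%9F%A5%E8%AF%A2"  # value as written in the question
decoded = unquote(pre_encoded)       # the plain text behind that encoding

# Encoding an already-encoded value escapes every '%' again.
double_encoded = urlencode({"btn_search": pre_encoded})

# Passing the decoded text produces the intended encoding exactly once.
single_encoded = urlencode({"btn_search": decoded})

print(double_encoded)
print(single_encoded)
```

If this is the cause, passing the decoded text and the key without '=' would be the fix. A verification token of this kind is also usually tied to the session that loaded the form, so a hard-coded value may be rejected.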

Selenium Stale Element Reference Errors (Seems Random)?

I know there have been several questions asked regarding stale elements, but I can't seem to resolve these.
My site is private so unfortunately can't share, but seems to always throw the error somewhere within the below for-loop. This loop is meant to get the text of each row in a table (number of rows varies). I've assigned WebDriverWait commands and have a very similar for-loop earlier in my code to do the same thing in another table on the website which works perfectly. I've also tried including the link click command and table, body, and tableText definition inside the loop to redefine at every iteration.
Once the code stops and the error message displays (stale element reference: element is not attached to the page document (Session info: chrome=89.0.4389.128)), if I manually run everything line-by-line, it all seems to work and correctly grabs the text.
Any ideas? Thanks!
link = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.LINK_TEXT, "*link address*")))
link.click()
table = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.ID, "TableId")))
body = table.find_element(By.CLASS_NAME, "*table body class*")
tableText = body.find_elements(By.TAG_NAME, "tr")
rows = len(tableText)
approvedSigs = [None]*rows
for i in range(1, rows+1):
    approvedSigs[i-1] = (tableText[i-1].text)
    approvedSigs[i-1] = approvedSigs[i-1].lstrip()
    approvedSigs[i-1] = approvedSigs[i-1][9:]
    approvedSigs[i-1] = approvedSigs[i-1].replace("\n"," ")

Using XMLHttpRequest To Paste Clipboard

I am currently using Flask to create a website and have come across an interesting issue. There is some code that gives the user the option to input a value in for about 20 separate input fields. What I am trying to do is construct a button that would allow the user to paste in a column from an Excel table. Essentially, a button that will look at the clipboard, take the field, convert the string into an array, and place the values into each input in the order they appear in the list.
So far, I have been able to get the clipboard into a string using tk.Tk().clipboard_get(), and believe that I can get this value by making an XMLHttpRequest, but have had little luck in making it actually work.
Some code for what I am trying to accomplish:
Python:
@app.route('/some/path/here', methods = ['GET'])
def paste():
    try:
        values = tk.Tk().clipboard_get()
        values = values.replace('\n',',')
        return values
    except:
        return None
HTML:
<button type="button" style="float: right" onclick="Testing()">Paste</button>
<p id="textHere"></p>
JavaScript:
<script>
function Testing() {
    var wvpst = new XMLHttpRequest();
    wvpst.onreadystatechange = function(){
        if (this.readyState == 4 && this.status == 200) {
            var list = this.responseXML;
            // list = list.replace(/'/g,"").replace(/ '/g,"");
            // list = list.split(", ");
            document.getElementById("textHere").innerHTML = list;
        }
    }
    wvpst.open("GET","{{ url_for('paste') }}",true);
    wvpst.send();
}
</script>
For now, I am just trying to get the list of values copied from an Excel sheet, but nothing is being returned when the button is pressed. Am I simply using XMLHttpRequest incorrectly or is there something else I need to do to get this to work?
Set a debug breakpoint inside
if (this.readyState == 4 && this.status == 200) {
}
and inspect your response. Set another on the first line of your function in Flask. Those should give you visibility into where the breakdown is.
A few other, perhaps more important notes:
Point 1) In your Flask try/except, on failure you should still serve a response, namely a 500 response. Replace return None with:
return app.make_response("Couldn't parse clipboard information!"), 500
Point 2) There is no need to pass this information to your server for processing. You can accomplish this within the javascript of the front end and save your server some processing and your client some time waiting on an HTTP response.
Have the user paste their content into a textbox or another element, and then access the value from there.
Direct clipboard access isn't something most browsers give up freely, and so best to avoid that route.
Summary:
Your xmlhttprequest looks fine to me. I would guess that your try in flask is failing and returning something useless if anything at all.
Do this in javascript.
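The string-to-array step the question describes can be sketched as follows. This is a Python illustration only, since the answer recommends doing the work in JavaScript, where the same logic applies. The function name split_pasted_column is hypothetical, and the sketch assumes a copied spreadsheet column arrives as one cell per line:

```python
def split_pasted_column(text):
    """Split text copied from a single spreadsheet column into cell values."""
    # splitlines() handles both "\n" and "\r\n" line endings and does not
    # produce a trailing empty item when the text ends with a newline.
    # Blank cells in the middle are kept so later values stay aligned
    # with their input fields.
    return [line.strip() for line in text.splitlines()]


print(split_pasted_column("10\r\n20\r\n\r\n40\r\n"))
```

Each item in the returned list can then be assigned to the input field at the same position.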

Selenium scrapes only one result and ignores other related results

I am new to selenium. Searching a web site, I get 10 results for each page. Those results are shown as lists (li tags) on the page and each list contains the same attributes. When my conditions are met, I go to another related web page and get desired content. However, when my code keeps looping for the lists, it fails to find the same attributes for the others. Here is my code:
p_url = "https://www.linkedin.com/vsearch/f?keywords=BARCO%2BNV%2Bkortrijk&pt=people&page_num=5"
driver.get(p_url)
time.sleep(5)
results = driver.find_element_by_id("results-container")
employees = results.find_elements_by_tag_name('li')
#emp_list = []
#for i in range(len(employees)):
# emp_list.append(employees[i])
for emp in employees:
    try:
        main_emp = emp.find_element_by_css_selector("a.title.main-headline")
        name = emp.find_element_by_css_selector("a.title.main-headline").text
        href = main_emp.get_attribute("href")
        if name != "LinkedIn Member":
            location = emp.find_element_by_class_name("demographic").text
            href = main_emp.get_attribute("href")
            print(href)
            print(location)
            driver.get(href)
            exp = driver.find_element_by_id("background-experience")
            amkk = exp.find_elements_by_class_name("editable-item")
            for amk in amkk:
                him = amk.find_element_by_tag_name("header").text
                him2 = amk.find_element_by_class_name("experience-date-locale").text
                if '\n' in him:
                    a = him.split('\n')
                    print(a[0])
                    print(a[1])
                    print(him2)
    except Exception as exc:
        print(exc)
        continue
In this code the line main_emp = emp.find_element_by_css_selector("a.title.main-headline") stops working after the first iteration. As a result I get the error Message: stale element reference: element is not attached to the page document
From Stack Overflow questions I saw that some say the content is removed from the DOM, and another post suggested filling a list with the results. Here is what I tried:
emp_list = []
for i in range(len(employees)):
    emp_list.append(employees[i])
However, it did not work either.
How can I overcome this?
The selector you are using is wrong. You are getting the results using the results-container id. This works fine, but collecting the elements from it does not. It returns more elements than just the employees (I'm not quite sure why).
If you change your selectors to this single selector, you will get just the employees and no other unwanted elements.
employees = results.find_elements_by_css_selector("ol[id='results']>li")
Edit
Since you are opening the employees and losing the list of elements you might want to try opening the employee in a new tab, perform your actions here and close the tab afterwards.
Example:
for emp in employees:
    try:
        main_emp = emp.find_element_by_css_selector("a.title.main-headline")
        # Do stuff you need...
        # Open employee in new tab (make sure Keys is imported)
        main_emp.send_keys(Keys.CONTROL + 't')
        # Focus on new tab
        driver.switch_to.window(driver.window_handles[1])
        # Do stuff inside the employee page
        # Close the tab you opened
        driver.close()
        # Switch back to the first tab
        driver.switch_to.window(driver.window_handles[0])
    except Exception as exc:
        print(exc)
Note: For OSX you should use main_emp.send_keys(Keys.COMMAND + 't')
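An alternative to the new-tab approach, not taken from the answer above, is to read every name and href into plain Python values before navigating anywhere. Strings do not go stale when the page changes. A minimal sketch, assuming the selectors from this thread still match the page; the helper name collect_profile_links is hypothetical:

```python
CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def collect_profile_links(results, skip_name="LinkedIn Member"):
    """Return (name, href) pairs read from the result list before navigating."""
    links = []
    for item in results.find_elements(CSS, "ol[id='results']>li"):
        anchors = item.find_elements(CSS, "a.title.main-headline")
        if not anchors:
            continue  # list item without a headline link
        name = anchors[0].text
        if name != skip_name:
            links.append((name, anchors[0].get_attribute("href")))
    return links


# With a live session, navigation happens only after the list is complete:
# for name, href in collect_profile_links(results):
#     driver.get(href)
#     ...scrape the profile page...
```

Because the loop iterates over strings rather than WebElement objects, calling driver.get(href) inside it no longer invalidates anything the loop still needs.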

Correct way to handle pagination with form submission?

I have a form for doing a search on a search page:
<form action="{{ url_for('searchresults') }}" method="get" name="noname" id="theform">
{{ form2.page(id="hiddenpage") }}
... some form inputs
<button id = "mybutton" type = "submit" >Apply</button>
</form>
The form is a SearchForm, where
class SearchForm(Form):
    page = HiddenField()
    categories = SelectMultipleField(u'Text', validators=[Optional()])
    # some other stuff...
The view for searchresults handles the form:
@app.route('/searchresults', methods=['GET'])
def searchresults():
    form = SearchForm()
    # handle the form and get the search results using pagination
    page = int(request.args.getlist('page')[0])
    results = models.Product.query....paginate(page, 10, False)
    return render_template('searchresults.html', form=form, results=results, curpage=page)
The results in render_template will be the first 10 results of my query. In searchresults.html I display the results, along with a next and previous link for the other results. This page also contains the same search form which I re-instate as per the initial submission. Currently I'm handling the next and previous links using
Next
So the next link re-submits the same initial form, but with the page value increased. I'm not really happy with this approach because when I hover over the next link I don't see the actual page I will be directed to. It also is starting to feel like a bit of a hack. Is there a more natural way to do this? When the form is initially submitted I could use it to create a long string of the desired parameters and use that string in the rendered page as href=" {{ url_for('searchresults') }}?mystring", but this also seems unnatural. How should I handle this?
You have your form configured to submit as a GET request, so you don't really need to force a re-submission through Javascript, you can achieve the same result by setting the next and prev links to URLs that include all the parameters in your form with the page modified to the correct next and previous page numbers.
This is really easy to do with url_for(). Any arguments you add that do not match route components will be added to the query string, so you can do something like this:
Next
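The idea of carrying every form parameter into the next and previous links can be sketched with the standard library alone. This is an illustration, not Flask's own implementation: page_url and current_args are hypothetical names, and in a Flask view the dict could come from request.args.to_dict(flat=False), with url_for() doing the equivalent work.

```python
from urllib.parse import urlencode


def page_url(base_path, args, page):
    """Build a URL that repeats the current query arguments with a new page."""
    # Drop the old page value and keep every other argument, including
    # multi-valued ones such as a multiple select.
    params = {key: values for key, values in args.items() if key != "page"}
    params["page"] = page
    # doseq=True expands list values into repeated key=value pairs.
    return base_path + "?" + urlencode(params, doseq=True)


current_args = {"categories": ["books", "music"], "page": ["1"]}
print(page_url("/searchresults", current_args, 2))
```

A link built this way shows its real destination on hover, which addresses the concern raised in the question.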
One thing to keep in mind is CSRF. If you have it enabled in your form, then your next/prev URLs will also need to have a valid token. Or you can disable CSRF, which for a search form might be okay.
Take advantage of the fact that your form arguments are already present in the URL and use request.args to pass the URL parameters into your form:
form = SearchForm(request.args)
Then, if you make your page field an IntegerField with a HiddenInput widget instead of a string field:
from wtforms import IntegerField
from wtforms.widgets import HiddenInput
class SearchForm(Form):
    page = IntegerField(widget=HiddenInput(), default=1)
you can increment page before you pass the form off to your search results page:
form.page.data += 1
And then, in your page, you simply create the link to the next page:
Next
