I hope you are having a good day.
I am working with the Binance API through python-binance.
My bot runs fine most of the time, but after a while it randomly stops and gives me this error message: "APIError(code=-1099): Not found, unauthenticated, or unauthorized".
The error seems to happen at the same line of code every time.
from binance.client import Client
import time
from datetime import datetime
import pandas as pd

class LastPrice():
    def __init__(self, symbol):
        self.symbol = symbol
        self.api_key = "my key"
        self.secret_key = "my secret"

    def get_client(self):
        try:
            client = Client(self.api_key, self.secret_key)
        except Exception as e:
            print(e)
            pass
        return client

    def get_last_price(self):
        last_price_string = self.get_client().get_ticker(symbol=self.symbol)['lastPrice']
        last_price = float(last_price_string)
        return last_price_string

    def get_last_price_and_time(self):
        last_price_string = self.get_client().get_ticker(symbol=self.symbol)['lastPrice']
        close_time = self.get_client().get_ticker(symbol=self.symbol)['closeTime']
        last_price = float(last_price_string)
        price_and_time = [close_time, last_price]
        return price_and_time
[screenshot: error message]
I have included a screenshot of the error message I get. Despite using the same code to generate the client every time, I only get the error in that specific piece of code. My code stops because `client` is referenced before assignment, but I am looking for the cause of the preceding APIError.
Any idea what that error message means in the context of my code? Thanks in advance!
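For reference, the "referenced before assignment" part is a side effect of get_client() swallowing the exception: if Client(...) raises, client is never assigned, yet the function still executes return client. Here is a sketch of a variant that would surface the underlying APIError instead (the retry count is arbitrary, my own addition):

def get_client(self, retries=3):
    # Re-raise the original APIError instead of returning an unbound local.
    for attempt in range(retries):
        try:
            return Client(self.api_key, self.secret_key)
        except Exception as e:
            print(e)
            if attempt == retries - 1:
                raise
            time.sleep(1)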
import dramatiq
from dramatiq.brokers.redis import RedisBroker
from dramatiq.results import Results
from dramatiq.results.backends import RedisBackend

broker = RedisBroker(host="127.0.0.1", port=6379)
broker.declare_queue("default")
dramatiq.set_broker(broker)
# backend = RedisBackend()
# broker.add_middleware(Results(backend=backend))

@dramatiq.actor()
def print_words(text):
    print('This is ' + text)

print_words('sync')

a = print_words.send('async')
a.get_results()
I was checking alternatives to Celery and found Dramatiq. I'm just getting started with Dramatiq and I'm unable to retrieve results. I even tried setting the backend and 'save_results' to True. I'm always getting AttributeError: 'Message' object has no attribute 'get_results'.
Any idea how to get the result?
You were on the right track with adding a result backend. The way to instruct an actor to store results is store_results=True, not save_results, and the method to retrieve results is get_result(), not get_results.
When you call get_result() with block=False, you have to wait for the worker to store the result, like this:
import time

import dramatiq.results.errors

while True:
    try:
        res = a.get_result(backend=backend)
        break
    except dramatiq.results.errors.ResultMissing:
        # do something like retry N times.
        time.sleep(1)

print(res)
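Putting it all together, here is a minimal sketch of the corrected setup (note the actor has to return a value for there to be a result, the Results middleware must be added before the actor is declared, and a dramatiq worker has to be running for the message to be processed):

import time

import dramatiq
from dramatiq.brokers.redis import RedisBroker
from dramatiq.results import Results
from dramatiq.results.backends import RedisBackend
from dramatiq.results.errors import ResultMissing

backend = RedisBackend()
broker = RedisBroker(host="127.0.0.1", port=6379)
broker.add_middleware(Results(backend=backend))
dramatiq.set_broker(broker)

@dramatiq.actor(store_results=True)
def print_words(text):
    print('This is ' + text)
    return 'This is ' + text

a = print_words.send('async')

# Poll until the worker has stored the result.
while True:
    try:
        res = a.get_result(backend=backend)
        break
    except ResultMissing:
        time.sleep(1)

print(res)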
I'm querying Facebook with the following code, iterating over a list of page names to get each page's numeric ID and store it in a dictionary. I keep catching an HTTP 500 error, however, which doesn't appear with the short list I present here. See the code:
import json
import urllib.request

def FB_IDs(page_name, access_token=access_token):
    """ get page's numeric information """
    # construct URL
    base = "https://graph.facebook.com/v2.4"
    node = "/" + str(page_name)
    parameters = "/?access_token=%s" % access_token
    url = base + node + parameters
    # retrieve data
    with urllib.request.urlopen(url) as response:
        data = json.loads(response.read().decode())
    return data

pages_ids_dict = {}
for page in pages:
    pages_ids_dict[page] = FB_IDs(page, access_token)['id']
How can I automate this and avoid the error?
There is a pretty standard helper function for this, which you might want to look at:
### HELPER FUNCTION ###
import datetime
import time
import urllib.request

def request_until_succeed(url):
    """ helper function to catch HTTP error 500 """
    req = urllib.request.Request(url)
    success = False
    while success is False:
        try:
            response = urllib.request.urlopen(req)
            if response.getcode() == 200:
                success = True
        except Exception as e:
            print(e)
            time.sleep(5)
            print("Error for URL %s: %s" % (url, datetime.datetime.now()))
    return response.read()
Incorporate that helper into your function so every URL is fetched through it, and you should be fine.
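Your FB_IDs function could then delegate the fetch to the helper, for example (a sketch reusing your names):

def FB_IDs(page_name, access_token=access_token):
    """ get page's numeric information, retrying until HTTP 200 """
    base = "https://graph.facebook.com/v2.4"
    node = "/" + str(page_name)
    parameters = "/?access_token=%s" % access_token
    url = base + node + parameters
    # request_until_succeed() keeps retrying, so a transient 500 no
    # longer aborts the loop over pages
    return json.loads(request_until_succeed(url).decode())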
I want to save the result of a long running job on S3. The job is implemented in Python, so I'm using boto3. The user guide says to use S3.Client.upload_fileobj for this purpose which works fine, except I can't figure out how to check if the upload has succeeded. According to the documentation, the method doesn't return anything and doesn't raise an error. The Callback param seems to be intended for progress tracking instead of error checking. It is also unclear if the method call is synchronous or asynchronous.
If the upload failed for any reason, I would like to save the contents to the disk and log an error. So my question is: How can I check if a boto3 S3.Client.upload_fileobj call succeeded and do some error handling if it failed?
I use a combination of head_object and wait_until_exists.
import boto3
from botocore.exceptions import ClientError, WaiterError

session = boto3.Session()
s3_client = session.client('s3')
s3_resource = session.resource('s3')

def upload_src(src, filename, bucketName):
    success = False
    try:
        bucket = s3_resource.Bucket(bucketName)
    except ClientError as e:
        bucket = None
    try:
        # In case filename already exists, get current etag to check if the
        # contents change after upload
        head = s3_client.head_object(Bucket=bucketName, Key=filename)
    except ClientError:
        etag = ''
    else:
        etag = head['ETag'].strip('"')
    try:
        s3_obj = bucket.Object(filename)
    except (ClientError, AttributeError):
        s3_obj = None
    try:
        s3_obj.upload_fileobj(src)
    except (ClientError, AttributeError):
        pass
    else:
        try:
            s3_obj.wait_until_exists(IfNoneMatch=etag)
        except WaiterError as e:
            pass
        else:
            head = s3_client.head_object(Bucket=bucketName, Key=filename)
            success = head['ContentLength']
    return success
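Usage is then something like (a sketch; the file and bucket names are placeholders):

with open('results.bin', 'rb') as f:
    size = upload_src(f, 'results.bin', 'my-bucket')
if not size:
    # upload could not be confirmed: save the contents to disk and log an error
    pass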
There is a wait_until_exists() helper function on the boto3 resource object that seems to be intended for this purpose.
This is how we are using it:
s3_client.upload_fileobj(file, BUCKET_NAME, file_path)
s3_resource.Object(BUCKET_NAME, file_path).wait_until_exists()
I would recommend performing the following operations:

try:
    response = upload_fileobj()
except Exception as e:
    # save the contents to disk and log an error
if response is None:
    # poll every 10 s with head_object() to check whether the file
    # uploaded successfully:
    #   on a success response from head_object(): break
    #   on an error accessing the object: save the contents to disk
    #   and log an error

So, basically, do the polling using head_object().
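Here is a runnable sketch of that recommendation (the 10-second interval and the retry cap are assumptions; note that upload_fileobj() returns None even on success, so the meaningful failure signals are an exception or a failed head_object() poll):

import time

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

def upload_with_polling(fileobj, bucket, key, retries=6):
    try:
        s3.upload_fileobj(fileobj, bucket, key)
    except Exception:
        return False  # caller saves the contents to disk and logs the error
    # No exception was raised: poll head_object() to confirm the object exists
    for _ in range(retries):
        try:
            s3.head_object(Bucket=bucket, Key=key)
            return True
        except ClientError:
            time.sleep(10)
    return False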
I am executing a Python script with threading, where given a "query" term that I put in the queue, I create the URL with the query parameters, set the cookies, and parse the webpage to return the products and their URLs. Here's the script.
Task: for a given set of queries, store the top 20 product IDs in a file, or a lower number if the query returns fewer results.
I remember reading that Selenium is not thread-safe. I just want to make sure that this problem occurs because of that limitation, and whether there is a way to make it work in concurrent threads. The main problem is that the script is I/O-bound, and thus very slow for scraping about 3000 URL fetches.
from pyvirtualdisplay import Display
from data_mining.scraping import scraping_conf as sf  # custom file with rules for scraping

import logging
import Queue
import threading
import time
import urllib
import urllib2

import pandas as pd

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By

num_threads = 5
COOKIES = sf.__MERCHANT_PARAMS[merchant_domain]['COOKIES']
query_args = sf.__MERCHANT_PARAMS[merchant_domain]['QUERY_ARGS']


class ThreadUrl(threading.Thread):
    """Threaded Url Grab"""
    def __init__(self, queue, out_queue):
        threading.Thread.__init__(self)
        self.queue = queue
        self.out_queue = out_queue

    def url_from_query(self, query):
        for key, val in query_args.items():
            if query_args[key] == 'query':
                query_args[key] = query
        print "query", query
        try:
            url = base_url + urllib.urlencode(query_args)
            print "url"
            return url
        except Exception as e:
            log()
            return None

    def init_driver_and_scrape(self, base_url, query, url):
        # Will use Pyvirtual display later
        #display = Display(visible=0, size=(1024, 768))
        #display.start()
        fp = webdriver.FirefoxProfile()
        fp.set_preference("browser.download.folderList", 2)
        fp.set_preference("javascript.enabled", True)
        driver = webdriver.Firefox(firefox_profile=fp)
        driver.delete_all_cookies()
        driver.get(base_url)
        for key, val in COOKIES[exp].items():
            driver.add_cookie({'name': key, 'value': val, 'path': '/', 'domain': merchant_domain, 'secure': False, 'expiry': None})
        print "printing cookie name & value"
        for cookie in driver.get_cookies():
            if cookie['name'] in COOKIES[exp].keys():
                print cookie['name'], "-->", cookie['value']
        driver.get(base_url + 'search=junk')  # To counter any refresh issues
        driver.implicitly_wait(20)
        driver.execute_script("window.scrollTo(0, 2000)")
        print "url inside scrape", url
        if url is not None:
            flag = True
            i = -1
            row_data, row_res = (), ()
            while flag:
                i = i + 1
                try:
                    driver.get(url)
                    key = sf.__MERCHANT_PARAMS[merchant_domain]['GET_ITEM_BY_ID'] + str(i)
                    print key
                    item = driver.find_element_by_id(key)
                    href = item.get_attribute("href")
                    prod_id = eval(sf.__MERCHANT_PARAMS[merchant_domain]['PRODUCTID_EVAL_FUNC'])
                    row_res = row_res + (prod_id,)
                    print url, row_res
                except Exception as e:
                    log()
                    flag = False
            driver.delete_all_cookies()
            driver.close()
            return query + "|" + str(row_res) + "\n"  # row_data, row_res
        else:
            return query + "|" + "None" + "\n"

    def run(self):
        while True:
            #grabs host from queue
            query = self.queue.get()
            url = self.url_from_query(query)
            print "query, url", query, url
            data = self.init_driver_and_scrape(base_url, query, url)
            self.out_queue.put(data)
            #signals to queue job is done
            self.queue.task_done()


class DatamineThread(threading.Thread):
    """Threaded Url Grab"""
    def __init__(self, out_queue):
        threading.Thread.__init__(self)
        self.out_queue = out_queue

    def run(self):
        while True:
            #grabs host from queue
            data = self.out_queue.get()
            fh.write(str(data) + "\n")
            #signals to queue job is done
            self.out_queue.task_done()


start = time.time()

def log():
    logging_hndl = logging.getLogger("get_results_url")
    logging_hndl.exception("Stacktrace from " + "get_results_url")

df = pd.read_csv(fh_query, sep='|', skiprows=0, header=0, usecols=None, error_bad_lines=False)  # read all queries
query_list = list(df['query'].values)[0:3]

# shared queues (as in the IBM skeleton this code is based on)
queue = Queue.Queue()
out_queue = Queue.Queue()

def main():
    exp = "Control"
    #spawn a pool of threads, and pass them queue instance
    for i in range(num_threads):
        t = ThreadUrl(queue, out_queue)
        t.setDaemon(True)
        t.start()
    #populate queue with data
    print query_list
    for query in query_list:
        queue.put(query)
    for i in range(num_threads):
        dt = DatamineThread(out_queue)
        dt.setDaemon(True)
        dt.start()
    #wait on the queue until everything has been processed
    queue.join()
    out_queue.join()

main()
print "Elapsed Time: %s" % (time.time() - start)
While I should be getting all the search results from each URL page, I only get the first (i=0) search card, and this doesn't execute for all queries/URLs. What am I doing wrong?
What I expect -
url inside scrape http://<masked>/search=nike+costume
searchResultsItem0
url inside scrape http://<masked>/search=red+tops
searchResultsItem0
url inside scrape http://<masked>/search=halloween+costumes
searchResultsItem0
and more searchResultsItem(s), like searchResultsItem1, searchResultsItem2, and so on.
What I get
url inside scrape http://<masked>/search=nike+costume
searchResultsItem0
url inside scrape http://<masked>/search=nike+costume
searchResultsItem0
url inside scrape http://<masked>/search=nike+costume
searchResultsItem0
The skeleton code was taken from
http://www.ibm.com/developerworks/aix/library/au-threadingpython/
Additionally, when I use PyVirtualDisplay, will that work with threading as well? I also used processes with the same Selenium code, and it gave the same error. Essentially it opens up 3 Firefox browsers with the exact same URLs, while it should be opening them with different items from the queue. Here the rules are stored in a file that I import as sf, which has all the custom attributes of a base domain.
Since setting the cookies is an integral part of my script, I can't use dryscrape.
EDIT:
I tried to localize the error, and here's what I found.
In the custom rules file (the "sf" I import above), I had defined QUERY_ARGS as:
__MERCHANT_PARAMS = {
    "some_domain.com": {
        COOKIES: { <a dict of dict, masked here>
        },
        ... more such rules
        QUERY_ARGS: {'search': 'query'}
    }
}
So what is really happening is that calling
query_args = sf.__MERCHANT_PARAMS[merchant_domain]['QUERY_ARGS']
should return the dict {'search': 'query'}, while instead it raises:
AttributeError: 'module' object has no attribute '_ThreadUrl__MERCHANT_PARAMS'
This is where I don't understand how the thread is prepending '_ThreadUrl__'. I also tried re-initializing query_args inside the url_from_query method, but this doesn't work.
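A minimal standalone snippet reproduces the same '_ClassName__' prefix without any threading involved, so this seems to be Python's name mangling of double-underscore identifiers referenced inside a class rather than anything the thread does:

class M(object):
    pass

m = M()

class Demo(object):
    def method(self):
        # Referenced inside a class, __PARAMS is compiled as _Demo__PARAMS,
        # so this raises:
        # AttributeError: 'M' object has no attribute '_Demo__PARAMS'
        return m.__PARAMS

Demo().method()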
Any pointers on what I am doing wrong?
I may be replying pretty late to this. However, I tested it on Python 2.7, and both options, multithreading and multiprocessing, work with Selenium, and it opens two separate browsers.
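A minimal sketch along those lines (assuming Firefox; the key point is that each worker creates its own WebDriver instance, since a single driver must never be shared across threads or processes):

from multiprocessing import Pool

from selenium import webdriver

def fetch_title(url):
    # Each worker process gets its own browser instance.
    driver = webdriver.Firefox()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()

if __name__ == '__main__':
    urls = ['http://example.com', 'http://example.org']
    pool = Pool(2)
    print(pool.map(fetch_title, urls))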