googletrans Translate() not working on Spyder but works on Colab - python-3.x

I am using the googletrans Translator on offline data in a local repo:
translator = Translator()
translations = []
for element in df['myText']:
    translations.append(translator.translate(element).text)
df['translations'] = translations
On Google Colab it works fine (about 20 minutes), but on my machine it runs for about 30 minutes and then stops with a ReadTimeout error:
File "<ipython-input-9-2209313a9a78>", line 4, in <module>
translations.append(translator.translate(element).text)
File "C:\Anaconda3\lib\site-packages\googletrans\client.py", line 182, in translate
data = self._translate(text, dest, src, kwargs)
File "C:\Anaconda3\lib\site-packages\googletrans\client.py", line 83, in _translate
r = self.client.get(url, params=params)
File "C:\Anaconda3\lib\site-packages\httpx\_client.py", line 763, in get
timeout=timeout,
File "C:\Anaconda3\lib\site-packages\httpx\_client.py", line 601, in request
request, auth=auth, allow_redirects=allow_redirects, timeout=timeout,
File "C:\Anaconda3\lib\site-packages\httpx\_client.py", line 621, in send
request, auth=auth, timeout=timeout, allow_redirects=allow_redirects,
File "C:\Anaconda3\lib\site-packages\httpx\_client.py", line 648, in send_handling_redirects
request, auth=auth, timeout=timeout, history=history
File "C:\Anaconda3\lib\site-packages\httpx\_client.py", line 684, in send_handling_auth
response = self.send_single_request(request, timeout)
File "C:\Anaconda3\lib\site-packages\httpx\_client.py", line 719, in send_single_request
timeout=timeout.as_dict(),
File "C:\Anaconda3\lib\site-packages\httpcore\_sync\connection_pool.py", line 153, in request
method, url, headers=headers, stream=stream, timeout=timeout
File "C:\Anaconda3\lib\site-packages\httpcore\_sync\connection.py", line 78, in request
return self.connection.request(method, url, headers, stream, timeout)
File "C:\Anaconda3\lib\site-packages\httpcore\_sync\http11.py", line 62, in request
) = self._receive_response(timeout)
File "C:\Anaconda3\lib\site-packages\httpcore\_sync\http11.py", line 115, in _receive_response
event = self._receive_event(timeout)
File "C:\Anaconda3\lib\site-packages\httpcore\_sync\http11.py", line 145, in _receive_event
data = self.socket.read(self.READ_NUM_BYTES, timeout)
File "C:\Anaconda3\lib\site-packages\httpcore\_backends\sync.py", line 62, in read
return self.sock.recv(n)
File "C:\Anaconda3\lib\contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "C:\Anaconda3\lib\site-packages\httpcore\_exceptions.py", line 12, in map_exceptions
raise to_exc(exc) from None
ReadTimeout: The read operation timed out
My machine: 16 GB RAM (i5 + NVIDIA);
Google Colab RAM: 0.87 GB/12.72 GB
# Data Size
len(df) : 1800
Not sure why it doesn't run on my local machine; I have worked on heavier datasets before.
I am using Python 3 (Spyder 4.0).

I'm having some problems with this translation as well... It appears that the error you're getting has nothing to do with your machine, but with the request to the API timing out. Try passing a Timeout object from the httpx library to the Translator constructor. Something like this:
import httpx
timeout = httpx.Timeout(5) # 5 seconds timeout
translator = Translator(timeout=timeout)
You can change the 5 to another value if needed. It has solved the problem for me so far.
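For the loop in the question, here is a minimal sketch combining the timeout with a simple retry (the retry count and the assumption that the timeout surfaces as httpx.ReadTimeout are mine, not from the answer above; df is the DataFrame from the question):
import httpx
from googletrans import Translator

timeout = httpx.Timeout(10)          # allow up to 10 seconds per request
translator = Translator(timeout=timeout)

translations = []
for element in df['myText']:
    for attempt in range(3):         # retry a few times on slow responses
        try:
            translations.append(translator.translate(element).text)
            break
        except httpx.ReadTimeout:
            if attempt == 2:
                raise                # give up after the last attempt
df['translations'] = translations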

Related

How do I stop my script from timing out when I try to extract data from Google Search Console to BigQuery?

I'm trying to export data from Google Search Console to BigQuery using Python. However, it keeps timing out and I keep getting the following error:
Traceback (most recent call last):
File "/Users/markleach/Python/GSC_BQ/gsc_bq.py", line 147, in <module>
y = get_sc_df(p,"2021-12-01","2022-12-01",x)
File "/Users/markleach/Python/GSC_BQ/gsc_bq.py", line 71, in get_sc_df
response = service.searchanalytics().query(siteUrl=site_url, body=request).execute()
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
return wrapped(*args, **kwargs)
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/site-packages/googleapiclient/http.py", line 923, in execute
resp, content = _retry_request(
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/site-packages/googleapiclient/http.py", line 222, in _retry_request
raise exception
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/site-packages/googleapiclient/http.py", line 191, in _retry_request
resp, content = http.request(uri, method, *args, **kwargs)
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/site-packages/google_auth_httplib2.py", line 218, in request
response, content = self.http.request(
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/site-packages/httplib2/__init__.py", line 1720, in request
(response, content) = self._request(
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/site-packages/httplib2/__init__.py", line 1440, in _request
(response, content) = self._conn_request(conn, request_uri, method, body, headers)
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/site-packages/httplib2/__init__.py", line 1392, in _conn_request
response = conn.getresponse()
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/http/client.py", line 1377, in getresponse
response.begin()
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/http/client.py", line 320, in begin
version, status, reason = self._read_status()
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/http/client.py", line 281, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/socket.py", line 704, in readinto
return self._sock.recv_into(b)
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/ssl.py", line 1242, in recv_into
return self.read(nbytes, buffer)
File "/Users/markleach/opt/anaconda3/envs/Sandpit/lib/python3.9/ssl.py", line 1100, in read
return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
This is the code around the mentioned lines:
for p in PROPERTIES:
    for x in range(0, 100000, 25000):
        y = get_sc_df(p, "2021-12-01", "2022-12-01", x)
        if len(y) < 25000:
            break
        else:
            continue
I'd be grateful for any advice on how to correct this. Thanks in advance.
Mark
PS: the whole code is in picture format below, as I'm not allowed to post more code than text.
The script was timing out because of the size of the dataset required.
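Two mitigations that often help here, as a sketch (the row limit, retry count, socket timeout, and dimensions are illustrative values; service and site_url are assumed to be defined as in the original script):
import socket

socket.setdefaulttimeout(300)  # raise the socket read timeout used by the underlying httplib2 client

request_body = {
    "startDate": "2021-12-01",
    "endDate": "2022-12-01",
    "dimensions": ["page", "query"],
    "rowLimit": 10000,   # smaller pages than 25000, so each request finishes sooner
    "startRow": 0,
}
response = (
    service.searchanalytics()
    .query(siteUrl=site_url, body=request_body)
    .execute(num_retries=3)  # let the client library retry transient failures
)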

Python RecursionError raised Requests 2.26 and Firebase 5.1

For the past week I've been facing a weird bug on my Python server.
Right now it is running on Requests version 2.23.0 without issue.
Because of a vulnerability I'd like to bump the Requests version to 2.26.0.
My server runs fine until I try to run a piece of code like this:
import requests
from firebase_admin import auth
bearer_token = requests.headers['X-Bearer-Token'] # Usually the word `Bearer` is included; assume it has been removed.
decoded_token = auth.verify_id_token(bearer_token, check_revoked=False)
This piece of code will raise:
RecursionError: maximum recursion depth exceeded
Full error:
Traceback (most recent call last):
File "./project/handlers/users.py", line 106, in get_user
decoded_token = auth.verify_id_token(a, check_revoked=False)
File "./project/venv/lib/python3.6/site-packages/firebase_admin/auth.py", line 220, in verify_id_token
return client.verify_id_token(id_token, check_revoked=check_revoked)
File "./project/venv/lib/python3.6/site-packages/firebase_admin/_auth_client.py", line 127, in verify_id_token
verified_claims = self._token_verifier.verify_id_token(id_token)
File "./project/venv/lib/python3.6/site-packages/firebase_admin/_token_gen.py", line 293, in verify_id_token
return self.id_token_verifier.verify(id_token, self.request)
File "./project/venv/lib/python3.6/site-packages/firebase_admin/_token_gen.py", line 396, in verify
certs_url=self.cert_url)
File "./project/venv/lib/python3.6/site-packages/google/oauth2/id_token.py", line 124, in verify_token
certs = _fetch_certs(request, certs_url)
File "./project/venv/lib/python3.6/site-packages/google/oauth2/id_token.py", line 98, in _fetch_certs
response = request(certs_url, method="GET")
File "./project/venv/lib/python3.6/site-packages/firebase_admin/_token_gen.py", line 266, in __call__
url, method=method, body=body, headers=headers, timeout=timeout, **kwargs)
File "./project/venv/lib/python3.6/site-packages/google/auth/transport/requests.py", line 184, in __call__
method, url, data=body, headers=headers, timeout=timeout, **kwargs
File "./project/venv/lib/python3.6/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "./project/venv/lib/python3.6/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "./project/venv/lib/python3.6/site-packages/cachecontrol/adapter.py", line 57, in send
resp = super(CacheControlAdapter, self).send(request, **kw)
File "./project/venv/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "./project/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "./project/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request
self._validate_conn(conn)
File "./project/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 839, in _validate_conn
conn.connect()
File "./project/venv/lib/python3.6/site-packages/urllib3/connection.py", line 332, in connect
cert_reqs=resolve_cert_reqs(self.cert_reqs),
File "./project/venv/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 281, in create_urllib3_context
context.options |= options
File "/usr/local/lib/python3.6/ssl.py", line 465, in options
super(SSLContext, SSLContext).options.__set__(self, value)
File "/usr/local/lib/python3.6/ssl.py", line 465, in options
super(SSLContext, SSLContext).options.__set__(self, value)
File "/usr/local/lib/python3.6/ssl.py", line 465, in options
super(SSLContext, SSLContext).options.__set__(self, value)
[Previous line repeated 963 more times]
File "/usr/local/lib/python3.6/ssl.py", line 463, in options
#options.setter
RecursionError: maximum recursion depth exceeded
Libraries:
requests = ^2.24.0
firebase-admin = ^5.0.0
I solved it by removing the Eventlet monkey patch:
eventlet.monkey_patch()
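If the patch is still needed elsewhere in the project, one alternative (an assumption on my part, not part of the original answer) is to apply it before anything imports ssl through requests or firebase_admin, since eventlet recommends calling monkey_patch() as early in the program as possible:
import eventlet
eventlet.monkey_patch()  # apply before ssl/requests are imported

import requests                   # imported only after patching
from firebase_admin import auth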

AWS Lambda Python Function with pygsheets causing [Errno 97] Address family not supported by protocol

I am using Python and the Serverless Framework to deploy functions to AWS Lambda. I have been running into an issue when I want to run a SQL query and then output the results into a Google Sheet. My code works perfectly locally, but not on AWS Lambda.
I am also using a VPC configuration, however, I have already verified it has internet access and everything else in my code functions properly in Lambda.
gc = pygsheets.authorize(service_account_env_var=service_account)
ws = gc.open(workbook_name)
wks = ws.worksheet_by_title(sheet_name)
wks.set_dataframe(df, (row_num, col_num))
The code breaks when calling ws = gc.open(workbook_name) and returns the error.
gc = pygsheets.authorize(service_account_env_var=service_account)
This line works fine when creating my pygsheets client.
Below is the error message returned:
[ERROR] OSError: [Errno 97] Address family not supported by protocol
Traceback (most recent call last):
File "/var/task/handler.py", line 70, in main
pd_to_gsheets(workbook_name=workbook, sheet_name=test_sheet, df=df, row_num=1, col_num=1)
File "/var/task/handler.py", line 31, in pd_to_gsheets
ws = gc.open(workbook_name)
File "/var/task/pygsheets/client.py", line 143, in open
spreadsheet = list(filter(lambda x: x['name'] == title, self.drive.spreadsheet_metadata()))[0]
File "/var/task/pygsheets/drive.py", line 144, in spreadsheet_metadata
return self._metadata_for_mime_type(self._spreadsheet_mime_type, query, only_team_drive)
File "/var/task/pygsheets/drive.py", line 169, in _metadata_for_mime_type
return self.list(fields=FIELDS_TO_INCLUDE,
File "/var/task/pygsheets/drive.py", line 87, in list
response = self._execute_request(self.service.files().list(**kwargs))
File "/var/task/pygsheets/drive.py", line 427, in _execute_request
return request.execute(num_retries=self.retries)
File "/var/task/googleapiclient/_helpers.py", line 134, in positional_wrapper
return wrapped(*args, **kwargs)
File "/var/task/googleapiclient/http.py", line 905, in execute
resp, content = _retry_request(
File "/var/task/googleapiclient/http.py", line 176, in _retry_request
resp, content = http.request(uri, method, *args, **kwargs)
File "/var/task/google_auth_httplib2.py", line 209, in request
self.credentials.before_request(self._request, method, uri, request_headers)
File "/var/task/google/auth/credentials.py", line 133, in before_request
self.refresh(request)
File "/var/task/google/oauth2/service_account.py", line 376, in refresh
access_token, expiry, _ = _client.jwt_grant(
File "/var/task/google/oauth2/_client.py", line 153, in jwt_grant
response_data = _token_endpoint_request(request, token_uri, body)
File "/var/task/google/oauth2/_client.py", line 105, in _token_endpoint_request
response = request(method="POST", url=token_uri, headers=headers, body=body)
File "/var/task/google_auth_httplib2.py", line 119, in __call__
response, data = self.http.request(
File "/var/task/httplib2/__init__.py", line 1708, in request
(response, content) = self._request(
File "/var/task/httplib2/__init__.py", line 1424, in _request
(response, content) = self._conn_request(conn, request_uri, method, body, headers)
File "/var/task/httplib2/__init__.py", line 1346, in _conn_request
conn.connect()
File "/var/task/httplib2/__init__.py", line 1182, in connect
raise socket_err
File "/var/task/httplib2/__init__.py", line 1132, in connect
sock = socket.socket(family, socktype, proto)
File "/var/lang/lib/python3.8/socket.py", line 231, in __init__
_socket.socket.__init__(self, family, type, proto, fileno)

JupyterLab/Elyra: pipeline run on Kubeflow Pipelines fails with "No host specified" in local deployment

I have Kubeflow Pipelines running in my local environment, along with JupyterLab and the Elyra extensions. I've created a notebook pipeline and configured the runtime configuration, setting api_endpoint to http://localhost:31380/pipeline (with security disabled). When I try to run the pipeline, the following error message is displayed:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/tornado/web.py", line 1703, in _execute
result = await result
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/handlers.py", line 89, in post
response = await PipelineProcessorManager.instance().process(pipeline)
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/processor.py", line 70, in process
res = await asyncio.get_event_loop().run_in_executor(None, processor.process, pipeline)
File "/usr/local/Cellar/python#3.8/3.8.6/Frameworks/Python.framework/Versions/3.8/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/processor_kfp.py", line 100, in process
raise lve
File "/usr/local/lib/python3.8/site-packages/elyra/pipeline/processor_kfp.py", line 89, in process
client.upload_pipeline(pipeline_path,
File "/usr/local/lib/python3.8/site-packages/kfp/_client.py", line 720, in upload_pipeline
response = self._upload_api.upload_pipeline(pipeline_package_path, name=pipeline_name, description=description)
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api/pipeline_upload_service_api.py", line 83, in upload_pipeline
return self.upload_pipeline_with_http_info(uploadfile, **kwargs) # noqa: E501
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api/pipeline_upload_service_api.py", line 177, in upload_pipeline_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 378, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 195, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 421, in request
return self.rest_client.POST(url,
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/rest.py", line 279, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.8/site-packages/kfp_server_api/rest.py", line 196, in request
r = self.pool_manager.request(
File "/usr/local/lib/python3.8/site-packages/urllib3/request.py", line 79, in request
return self.request_encode_body(
File "/usr/local/lib/python3.8/site-packages/urllib3/request.py", line 171, in request_encode_body
return self.urlopen(method, url, **extra_kw)
File "/usr/local/lib/python3.8/site-packages/urllib3/poolmanager.py", line 325, in urlopen
conn = self.connection_from_host(u.host, port=u.port, scheme=u.scheme)
File "/usr/local/lib/python3.8/site-packages/urllib3/poolmanager.py", line 231, in connection_from_host
raise LocationValueError("No host specified.")
urllib3.exceptions.LocationValueError: No host specified.
The root cause is an issue in the Kubeflow Pipelines kfp package version 1.0.0 that is distributed with Elyra v1.4.1 (and lower). To work around the issue, replace localhost with 127.0.0.1 in your runtime configuration, e.g. http://127.0.0.1:31380/pipeline.
Edit: With the availability of Elyra v1.5+, which requires a more recent version of the kfp package, you can also upgrade Elyra to resolve the issue.
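To verify the workaround outside Elyra, here is a minimal sketch, assuming the kfp package is installed and Kubeflow Pipelines is exposed on the same NodePort:
import kfp

# 127.0.0.1 instead of localhost avoids the host-parsing issue in kfp 1.0.0
client = kfp.Client(host="http://127.0.0.1:31380/pipeline")
print(client.list_pipelines())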

ReadTimeOut Error while writer.write(xlist) in Alibaba odps connection. Any Suggestion?

from odps import ODPS
from odps import options
import csv
import os
from datetime import timedelta, datetime
options.sql.use_odps2_extension = True
options.tunnel.use_instance_tunnel = True
options.connect_timeout = 60
options.read_timeout=130
options.retry_times = 7
options.chunk_size = 8192*2
odps = ODPS('id','secret','project', endpoint ='endpointUrl')
table = odps.get_table('eventTable')
def uploadFile(file):
    with table.open_writer(partition=None) as writer:
        with open(file, 'rt') as csvfile:
            rows = csv.reader(csvfile, delimiter='~')
            for final in rows:
                writer.write(final)
        writer.close()

uploadFile('xyz.csv')
Assume I pass a number of files to uploadFile one by one from a directory, to connect to Alibaba Cloud from Python and migrate data into a MaxCompute table. When I run this code, the service stops either after working for a long time or overnight, and reports a ReadTimeout error at the line writer.write(final).
Error:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/response.py", line 226, in _error_catcher
yield
File "/usr/lib/python3/dist-packages/urllib3/response.py", line 301, in read
data = self._fp.read(amt)
File "/usr/lib/python3.5/http/client.py", line 448, in read
n = self.readinto(b)
File "/usr/lib/python3.5/http/client.py", line 488, in readinto
n = self.fp.readinto(b)
File "/usr/lib/python3.5/socket.py", line 575, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/models.py", line 660, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/usr/lib/python3/dist-packages/urllib3/response.py", line 344, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "/usr/lib/python3/dist-packages/urllib3/response.py", line 311, in read
flush_decoder = True
File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/lib/python3/dist-packages/urllib3/response.py", line 231, in _error_catcher
raise ReadTimeoutError(self._pool, None, 'Read timed out.')
requests.packages.urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='dt.odps.aliyun.com', port=80): Read timed out.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/dataUploader.py", line 66, in <module>
uploadFile('xyz.csv')
File "/dataUploader.py", line 53, in uploadFile
writer.write(final)
File "/usr/local/lib/python3.5/dist-packages/odps/models/table.py", line 643, in __exit__
self.close()
File "/usr/local/lib/python3.5/dist-packages/odps/models/table.py", line 631, in close
upload_session.commit(written_blocks)
File "/usr/local/lib/python3.5/dist-packages/odps/tunnel/tabletunnel.py", line 308, in commit
in self.get_block_list()])
File "/usr/local/lib/python3.5/dist-packages/odps/tunnel/tabletunnel.py", line 298, in get_block_list
self.reload()
File "/usr/local/lib/python3.5/dist-packages/odps/tunnel/tabletunnel.py", line 238, in reload
resp = self._client.get(url, params=params, headers=headers)
File "/usr/local/lib/python3.5/dist-packages/odps/rest.py", line 138, in get
return self.request(url, 'get', stream=stream, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/odps/rest.py", line 125, in request
proxies=self._proxy)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 608, in send
r.content
File "/usr/lib/python3/dist-packages/requests/models.py", line 737, in content
self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
File "/usr/lib/python3/dist-packages/requests/models.py", line 667, in generate
raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='dt.odps.aliyun.com', port=80): Read timed out.
packet_write_wait: Connection to 122.121.122.121 port 22: Broken pipe
This is the error I got. Can you suggest what the problem is?
The read timeout is the timeout on waiting to read data. Usually, if the server fails to send a byte within the timeout period after the last byte, a read timeout error is raised.
This happens because the server couldn't read the file within the specified timeout period.
Here, the read timeout was set to 130 seconds, which may be too low if your file is very large.
Please increase the timeout limit from 130 seconds to 500 seconds, i.e. change options.read_timeout=130 to options.read_timeout=500.
That should resolve your problem; at the same time, reduce the retry count from 7 to 3,
i.e. change options.retry_times=7 to options.retry_times=3.
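As a concrete sketch of the settings suggested above (the values come from this answer, not the original script):
from odps import options

options.connect_timeout = 60
options.read_timeout = 500   # was 130; allow more time for large uploads
options.retry_times = 3      # was 7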
This error is usually caused by a network issue.
Execute curl <endpoint URL> in a terminal. If it returns immediately with something like this:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>NoSuchObject</Code>
<Message><![CDATA[Unknown http request location: /]]></Message>
<RequestId>5E5CC9526283FEC94F19DAAE</RequestId>
<HostId>localhost</HostId>
</Error>
Then the endpoint URL is reachable. But if it hangs, then you should check if you are using the right endpoint URL.
Since MaxCompute (ODPS) has public and private endpoints, it can be confusing sometimes.
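A Python equivalent of the curl check, as a sketch (the host below is the tunnel endpoint from the traceback; substitute your own endpoint URL):
import requests

resp = requests.get("http://dt.odps.aliyun.com", timeout=10)
print(resp.status_code)
print(resp.text[:200])  # a quick XML error like the one above means the endpoint is reachable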
