Uploading CSV files to Fusion Tables through Python - python-3.x

I am trying to grab data from looker and insert it directly into Google Fusion Tables using the MediaFileUpload so as to not download any files and upload from memory. My current code below returns a TypeError. Any help would be appreciated. Thanks!
Error returned to me:
Traceback (most recent call last):
File "csvpython.py", line 96, in <module>
main()
File "csvpython.py", line 88, in main
media = MediaFileUpload(dataq, mimetype='application/octet-stream', resumable=True)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/oauth2client/_helpers.py", line 133, in positional_wrapper
return wrapped(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/googleapiclient/http.py", line 548, in __init__
fd = open(self._filename, 'rb')
TypeError: expected str, bytes or os.PathLike object, not NoneType
Code in question:
for x, y, z in zip(look, destination, fusion):
look_data = lc.run_look(x)
df = pd.DataFrame(look_data)
stream = io.StringIO()
dataq = df.to_csv(path_or_buf=stream, sep=";", index=False)
media = MediaFileUpload(dataq, mimetype='application/octet-stream', resumable=True)
replace = ftserv.table().replaceRows(tableId=z, media_body=media, startLine=None, isStrict=False, encoding='UTF-8', media_mime_type='application/octet-stream', delimiter=';', endLine=None).execute()
After switching dataq to stream in MediaFileUpload, I have had the following returned to me:
Traceback (most recent call last):
File "quicktestbackup.py", line 96, in <module>
main()
File "quicktestbackup.py", line 88, in main
media = MediaFileUpload(stream, mimetype='application/octet-stream', resumable=True)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/oauth2client/_helpers.py", line 133, in positional_wrapper
return wrapped(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/googleapiclient/http.py", line 548, in __init__
fd = open(self._filename, 'rb')
TypeError: expected str, bytes or os.PathLike object, not _io.StringIO

DataFrame.to_csv is a void method and any side effects from calling it are passed to stream and not dataq. That is, dataq is NoneType and has no data - your CSV data is in stream.
When you construct the media file from the io object, you need to feed it the data from the stream (and not the stream itself), thus its getvalue() method is needed.
df.to_csv(path_or_buf=stream, ...)
media = MediaFileUpload(stream.getvalue(), ...)
The call to FusionTables looks to be perfectly valid.

Related

WSQ files not opening with Pillow/wsq when using joblib.Parallel

I am trying to preprocess large amounts of WSQ images for model training using both the Pillow and wsq libraries. To speed up my code, I am trying to use Parallel but this causes an UnidentifiedImageError.
I verified that the files are there where they should be, and that the function runs without errors when used in a regular for-loop. Other files (eg csv files) can be opened inside the function without errors, so I presume that the error lies with the combination of Parallel and Pillow/wsq. All libraries are up to date. As I am just starting out with Pillow and multiprocessing, I have no idea yet on how to fix this and any help would be highly appreciated.
Code:
from joblib import Parallel, delayed
from PIL import Image
import multiprocessing
import wsq
import numpy as np
def process_image(i):
path = "/home/user/project/wsq/image_"+str(i)+".wsq"
img = np.array(Image.open(path))
#some preprocessing, saving as npz
output_path = "/home/user/project/npz/image_"+str(i)+".npz"
np.savez_compressed(output_path, img)
return None
inputs = range(100000)
num_cores = multiprocessing.cpu_count()
Parallel(n_jobs=num_cores)(delayed(process_image)(i) for i in inputs)
Output:
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/user/.local/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 431, in _process_worker
r = call_item()
File "/home/user/.local/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 285, in __call__
return self.fn(*self.args, **self.kwargs)
File "/home/user/.local/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 595, in __call__
return self.func(*args, **kwargs)
File "/home/user/.local/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
return [func(*args, **kwargs)
File "/home/user/.local/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
return [func(*args, **kwargs)
File "preprocess_images.py", line 9, in process_image
img = np.array(Image.open(path))
File "/home/user/.local/lib/python3.8/site-packages/PIL/Image.py", line 2967, in open
raise UnidentifiedImageError(
PIL.UnidentifiedImageError: cannot identify image file '/home/user/project/wsq/image_1.wsq'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "preprocess_images.py", line 18, in <module>
Parallel(n_jobs=num_cores)(delayed(process_image)(i) for i in inputs)
File "/home/user/.local/lib/python3.8/site-packages/joblib/parallel.py", line 1054, in __call__
self.retrieve()
File "/home/user/.local/lib/python3.8/site-packages/joblib/parallel.py", line 933, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "/home/user/.local/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
return future.result(timeout=timeout)
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
PIL.UnidentifiedImageError: cannot identify image file '/home/user/project/wsq/image_1.wsq'

unable take input from a text file in python crawler

I have created a basic crawler in python, I want to take input from a text file.
I used open/raw_input but there was an error.
When I used input("") function it is prompting for input and was working fine.
The problem only with reading a file
import re
import urllib.request
url = open('input.txt', 'r')
data = urllib.request.urlopen(url).read()
data1 = data.decode("utf8")
print(data1)
file =open('output.txt' , 'w')
file.write(data1)
file.close()
error output below.
Traceback (most recent call last):
File "scrape.py", line 8, in <module>
data = urllib.request.urlopen(url).read()
File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.6/urllib/request.py", line 518, in open
protocol = req.type
AttributeError: '_io.TextIOWrapper' object has no attribute 'type'
the method open returns a file object, and not the content of the file as a string. if you want url to contain the content as a string, change the line to:
url = open('input.txt', 'r').read()

How to response with PIL image in Cherrypy dynamically (Python3)?

It seems the task is easy but...
I have simple PIL.Image object. How to make Cherrypy response with this image dynamically?
def get_image(self, data_id):
cherrypy.response.headers['Content-Type'] = 'image/png'
img = PIL.Image.frombytes(...)
buffer = io.StringIO()
img.save(buffer, 'PNG')
return buffer.getvalue()
This code gives me:
500 Internal Server Error
The server encountered an unexpected condition which prevented it from fulfilling the request.
Traceback (most recent call last):
File "C:\Users\Serge\AppData\Local\Programs\Python\Python36\lib\site-packages\cherrypy\_cprequest.py", line 631, in respond
self._do_respond(path_info)
File "C:\Users\Serge\AppData\Local\Programs\Python\Python36\lib\site-packages\cherrypy\_cprequest.py", line 690, in _do_respond
response.body = self.handler()
File "C:\Users\Serge\AppData\Local\Programs\Python\Python36\lib\site-packages\cherrypy\_cpdispatch.py", line 60, in __call__
return self.callable(*self.args, **self.kwargs)
File "D:\Dev\Bf\webapp\controllers\calculation.py", line 69, in get_image
img.save(buffer, 'PNG')
File "C:\Users\Serge\AppData\Local\Programs\Python\Python36\lib\site-packages\PIL\Image.py", line 1930, in save
save_handler(self, fp, filename)
File "C:\Users\Serge\AppData\Local\Programs\Python\Python36\lib\site-packages\PIL\PngImagePlugin.py", line 731, in _save
fp.write(_MAGIC)
TypeError: string argument expected, got 'bytes'
Can someone help me please?
Use io.BytesIO() instead of io.StringIO(). (From this answer.)

exchangelib get task from task queryset MIME conversion is not supported for this item

I am trying to access an object
through a queryset created with exchangelib however I get an error MIME CONVERSION IS NOT SUPPORTED FOR THIS ITEM, I don't know what it means. I have tried the same code with calendar items and I have no problem whatsoever. thanks
from exchangelib import Account, Credentials, DELEGATE
credentials = Credentials(username='BUREAU\\pepe', password='pepe')
pepe_account = Account(
primary_smtp_address='pepe#office.com',
credentials=credentials,
autodiscover=True,
access_type=DELEGATE)
tasks = pepe_account.tasks.filter()
print(tasks) -- Works
for task in tasks:
print(task)
The iteration fails, instead of print(task) I also tried pass and I get the same message.
Traceback (most recent call last):
File "test.py", line 13, in <module>
for task in tasks:
File "/home/pepe/.local/lib/python3.5/site-packages/exchangelib/queryset.py", line 197, in __iter__
for val in result_formatter(self._query()):
File "/home/pepe/.local/lib/python3.5/site-packages/exchangelib/queryset.py", line 272, in _as_items
for i in iterable:
File "/home/pepe/.local/lib/python3.5/site-packages/exchangelib/account.py", line 393, in fetch
for i in GetItem(account=self).call(items=ids, additional_fields=additional_fields, shape=IdOnly):
File "/home/pepe/.local/lib/python3.5/site-packages/exchangelib/services.py", line 456, in _pool_requests
for elem in elems:
File "/home/pepe/.local/lib/python3.5/site-packages/exchangelib/services.py", line 283, in _get_elements_in_response
container_or_exc = self._get_element_container(message=msg, name=self.element_container_name)
File "/home/pepe/.local/lib/python3.5/site-packages/exchangelib/services.py", line 256, in _get_element_container
self._raise_errors(code=response_code, text=msg_text, msg_xml=msg_xml)
File "/home/pepe/.local/lib/python3.5/site-packages/exchangelib/services.py", line 273, in _raise_errors
raise vars(errors)[code](text)
exchangelib.errors.ErrorUnsupportedMimeConversion: MIME conversion is not supported for this item type.

sqlalchemy insert - string argument without an encoding

The code below worked when using Python 2.7, but raises a StatementError when using Python 3.5. I haven't found a good explanation for this online yet.
Why doesn't sqlalchemy accept simple Python 3 string objects in this situation? Is there a better way to insert rows into a table?
from sqlalchemy import Table, MetaData, create_engine
import json
def add_site(site_id):
engine = create_engine('mysql+pymysql://root:password#localhost/database_name', encoding='utf8', convert_unicode=True)
metadata = MetaData()
conn = engine.connect()
table_name = Table('table_name', metadata, autoload=True, autoload_with=engine)
site_name = 'Buffalo, NY'
p_profile = {"0": 300, "1": 500, "2": 100}
conn.execute(table_name.insert().values(updated=True,
site_name=site_name,
site_id=site_id,
p_profile=json.dumps(p_profile)))
add_site(121)
EDIT The table was previously created with this function:
def create_table():
engine = create_engine('mysql+pymysql://root:password#localhost/database_name')
metadata = MetaData()
# Create table for updating sites.
table_name = Table('table_name', metadata,
Column('id', Integer, Sequence('user_id_seq'), primary_key=True),
Column('updated', Boolean),
Column('site_name', BLOB),
Column('site_id', SMALLINT),
Column('p_profile', BLOB))
metadata.create_all(engine)
EDIT Full error:
>>> scd.add_site(121)
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 1073, in _execute_context
context = constructor(dialect, self, conn, *args)
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/default.py", line 610, in _init_compiled
for key in compiled_params
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/default.py", line 610, in <genexpr>
for key in compiled_params
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/sql/sqltypes.py", line 834, in process
return DBAPIBinary(value)
File "/usr/local/lib/python3.5/dist-packages/pymysql/__init__.py", line 79, in Binary
return bytes(x)
TypeError: string argument without an encoding
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/user1/Desktop/server_algorithm/database_tools.py", line 194, in add_site
failed_acks=json.dumps(p_profile)))
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 914, in execute
return meth(self, multiparams, params)
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
compiled_sql, distilled_params
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 1078, in _execute_context
None, None)
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 1341, in _handle_dbapi_exception
exc_info
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/util/compat.py", line 185, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/base.py", line 1073, in _execute_context
context = constructor(dialect, self, conn, *args)
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/default.py", line 610, in _init_compiled
for key in compiled_params
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/engine/default.py", line 610, in <genexpr>
for key in compiled_params
File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/sql/sqltypes.py", line 834, in process
return DBAPIBinary(value)
File "/usr/local/lib/python3.5/dist-packages/pymysql/__init__.py", line 79, in Binary
return bytes(x)
sqlalchemy.exc.StatementError: (builtins.TypeError) string argument without an encoding [SQL: 'INSERT INTO table_name (updated, site_name, site_id, p_profile) VALUES (%(updated)s, %(site_name)s, %(site_id)s, %(p_profile)s)']
As univerio mentioned, the solution was to encode the string as follows:
conn.execute(table_name.insert().values(updated=True,
site_name=site_name,
site_id=site_id,
p_profile=bytes(json.dumps(p_profile), 'utf8')))
BLOBs require binary data, so we need bytes in Python 3 and str in Python 2, since Python 2 strings are sequences of bytes.
If we want to use Python 3 str, we need to use TEXT instead of BLOB.
You simply just need to convert your string to a byte string ex:
site_name=str.encode(site_name),
site_id=site_id,
p_profile=json.dumps(p_profile)))```
or
```site_name = b'Buffalo, NY'```

Resources