mysql.connector compatibility issues with pandas "pd.read_sql" - python-3.x

When I try reading data from a MySQL database using pandas.read_sql(), it gives me this warning:
UserWarning: pandas only supports SQLAlchemy connectable (engine/connection) or database string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested. Please consider using SQLAlchemy. df = pd.read_sql_query(q, mydb)
When I print the dataframe it still contains the correct data; however, I am worried that this may cause issues in the future.
Is there any way to overcome this, and is this warning of any significance?
I did read into SQLAlchemy engines, but I am unsure how to use an engine with mysql.connector and don't fully understand the purpose of the engine.
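For reference, a minimal sketch of the usual fix, assuming the mysqlconnector driver and illustrative credentials/table names:

import pandas as pd
from sqlalchemy import create_engine

# The engine manages DBAPI connections for pandas; with this URL,
# SQLAlchemy drives mysql.connector under the hood.
engine = create_engine("mysql+mysqlconnector://user:password@localhost:3306/mydb")

q = "SELECT * FROM my_table"          # illustrative table name
df = pd.read_sql_query(q, engine)     # no UserWarning: engine is a SQLAlchemy connectable

Passing the engine (or engine.connect()) instead of the raw mysql.connector connection is exactly what the warning is asking for.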

Related

Import CSV and write into SQL Server database with Sequelize node.js

I wrote a script that iterates over many .CSV files and (should) send the data with Sequelize to a SQL Server database. I use the csv-parse module to read and 'cast' the data, meaning that it reduces each value to its smallest datatype.
ISSUE: In order to send the data to the DB, I need to define a Model object that defines the format and datatypes of the table. I want this to be automated and based on the 'casted' CSV.
Additional issue: I want to be able to upsert into the table too, eventually.
So far I have managed to:
Connect to the DB with Sequelize and verify the connection (I also did a test query)
Read the CSVs and 'cast' the data. It recognises integers, floats and even dates. It also accepts the header that I passed to the csv module.
I am a novice with JS and I feel that I am not grasping something fundamental here. In Python and pandas this would be a simple one-liner. All help would be much appreciated!

jOOQ timestamp arithmetic: Cannot convert from Integer to LocalDateTime

With jOOQ 3.13.x, we are using
Field<Instant> midPointDueTime = TICKET.READY.plus(TICKET.DUE.minus(TICKET.READY).div(2));
where the READY and DUE fields are of type java.time.Instant. They are DATETIME fields in the database (normally java.sql.Timestamp) but are converted to Instant with a javax.persistence.AttributeConverter. The database in question is Informix, but we are using the open source version of jOOQ for now, with the DEFAULT dialect, and trying to avoid cases where things would deviate from standard SQL syntax.
From that field declaration, jOOQ 3.13.x generates the following SQL snippet, which works as expected:
TICKET.READY + ((TICKET.DUE - TICKET.READY) / 2)
This is the expected DATETIME arithmetic. We are looking for a timestamp halfway between READY and DUE.
But jOOQ 3.14 and 3.15 both throw a runtime exception:
org.jooq.exception.DataTypeException: Cannot convert from 2 (class java.lang.Integer) to class java.time.LocalDateTime
No SQL is generated, so I don't think this is an Informix compatibility issue. The error happens before any SQL statement is logged.
Is this possibly a bug, or is there something else I can do to achieve the same date arithmetic result?
Regarding dialects
From the SQLDialect.DEFAULT javadoc:
This dialect is chosen in the absence of a more explicit dialect. It is not intended to be used with any actual database as it may combine dialect-specific things from various dialects.
The purpose of this dialect is to be used in cases where no other dialect is available, and debug log information is needed, e.g. when writing:
DSL.abs(1).toString(); // This uses DEFAULT to render the abs(1) function call
You shouldn't use this dialect in any production situation; its behaviour may change at any time for various reasons. It's certainly not integration tested with any RDBMS. Use the SQLDialect.INFORMIX dialect instead.
After further research
Thanks to your detailed bug report here: https://github.com/jOOQ/jOOQ/issues/12544, it can be seen that this is a regression that will be fixed in jOOQ 3.16.0 and 3.15.4.

Bulk import of graph data with ArangoDB java driver

I have a question regarding bulk import when working with the graph layer of ArangoDB and its Java driver. I'm using ArangoDB 3.4.5 with Java driver 5.0.0.
In the document layer, it's possible to use ArangoCollection.importDocuments to insert several documents at once. However, for the collections of the graph layer, ArangoEdgeCollection and ArangoVertexCollection, the importDocuments function (or a corresponding importVertices/importEdges function) does not exist. So, if I want to bulk import my graph data, I have to ignore the graph layer and call importDocuments myself on the vertex collections, *_ELEMENT-PROPERTIES, *_ELEMENT-HAS-PROPERTIES, and edge collections separately.
Furthermore, when the edge collections already exist in the database, it's not even possible to perform a bulk import, because the existing collection is already defined as an edge collection.
Or maybe what I'm writing is not true and I have overlooked something essential?
If not, is there a reason why bulk import is not implemented for the graph layer? Or is a graph bulk import just a nice-to-have item that hasn't been implemented yet?
Based on my findings described above, bulk import of graph data with the Java driver is, in my opinion, not possible if the graph collections already exist (because of the edge collections). It would only be possible if the edge collections were created from scratch as ordinary collections, which already smells like having to write my own basic graph layer (which I don't want to do, of course).
I guess another option is importing JSON data, which I haven't analyzed much so far because it seems inconvenient when I need to manipulate (or create) the data with Java before storing it. Therefore, I would really like to work with the Java driver.
Thank you very much for any replies, opinions or corrections.

Python 3: Storing data without loading it into memory

I am currently building a Flask app that will have more data than I think I can load into memory. I have searched in many places and found that SQL seems to be a good solution; sadly, I cannot use SQL for this project due to some of its limitations.
My project consists of many entries of the form
database[lookupkey1][lookupkey2]...and more lookup keys
My current plan is to override __getitem__, __setitem__ and __delitem__ and replace them with calls to the database. Is there any kind of database that can store large amounts of maps/dictionaries like
{"spam":{"bar":["foo","test"],"foo":["bar"]}}
I am also currently using JSON to save the data, so it would be appreciated if the database had an easy way to migrate my current database.
Sorry that I'm not very good at writing Stack Overflow questions.
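Roughly, the wrapper I have in mind would look something like this (a sketch only; the standard-library shelve module stands in here for whatever database ends up behind it, and the names are illustrative):

import shelve

class DiskDict:
    # Dict-like wrapper whose item access goes to an on-disk store
    # instead of keeping everything in memory.
    def __init__(self, path):
        self._db = shelve.open(path)      # dbm-backed key/value file

    def __getitem__(self, key):
        return self._db[key]              # loads only this entry

    def __setitem__(self, key, value):
        self._db[key] = value             # value is pickled to disk

    def __delitem__(self, key):
        del self._db[key]

    def close(self):
        self._db.close()

db = DiskDict("mydata")
db["spam"] = {"bar": ["foo", "test"], "foo": ["bar"]}
print(db["spam"]["bar"])                  # ['foo', 'test']
db.close()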
Most document-oriented DBs like MongoDB would allow you to save data as nested dict-list-like objects and query them using their keys and indexes.
P.S. Accessing such a DB through Python's dict accessors is a bad idea, as it would produce a separate DB query for each step, which is highly inefficient and may lead to performance problems. Try looking at an ORM for the DB you choose, as most ORMs allow you to access a document-oriented DB's data in a way similar to accessing dicts and lists.
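For instance, a rough sketch with pymongo (the database and collection names here are made up):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
coll = client["mydb"]["entries"]

# Store the nested structure from the question as a single document.
coll.insert_one({"spam": {"bar": ["foo", "test"], "foo": ["bar"]}})

# Query nested keys with dot notation in one round trip, rather than
# walking the structure with chained dict accessors.
doc = coll.find_one({"spam.foo": "bar"}, {"spam.bar": 1})
print(doc["spam"]["bar"])   # ['foo', 'test']

The dot-notation query is what replaces a chain of __getitem__ calls with a single server-side lookup.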

SQL indices with Database.Persist (Yesod web framework)

Database.Persist seems to be index-agnostic. This is okay; I can create my own indices, but the generic SQL migration seems to create and drop tables when adding/removing fields, which has the effect of dropping the indices as well.
Is there a recommended way to make sure they survive database migrations?
Only the SQLite3 backend should be dropping tables; PostgreSQL and MySQL both provide ALTER TABLE commands powerful enough to avoid that. So indices should only be lost with SQLite3. If you're using SQLite3 in production (not really recommended), you have two choices:
Disable automatic migrations and handle the schema yourself.
Add some code after the migrations are run to replace any missing indices.
