Issue with PrestoDB & MongoDB - presto

I am having some strange issues querying MongoDB from the Presto CLI. I have my mongodb.properties set up to connect to three different databases, as shown below.
connector.name=mongodb
mongodb.seeds=172.23.0.7:27017
mongodb.schema-collection=stage,configuration,hub
mongodb.credentials=<username>:<password>#stage,<username>:<password>#hub,<username>:<password>#configuration
None of the queries, including show columns from <collection> or select count(*) from <collection>, work on stage or hub, nor on collections in configuration.
The question is: does Presto support these kinds of queries on MongoDB? If so, what could be the problem with my configuration or queries? Our intention is to compare data between Oracle and MongoDB.
Appreciate your help.

This is an old post, but I hope this is still useful for future readers. You shouldn't be setting mongodb.schema-collection this way. The property is meant to point to a single MongoDB collection that describes the schema of the other collections, defaulting to _schema when it exists. This is covered in the docs of most Presto distributions, including prestodb.
It does not let you control which collections Presto has access to; that must be done elsewhere (e.g. when setting up Presto's user in the MongoDB cluster). Once correctly set up, Presto will be able to run queries such as the ones in your example against all the collections it has access to.
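For what it's worth, a minimal sketch of what a working setup can look like (the catalog file path, the example collection, and the schema document below are placeholder assumptions; credentials are omitted):

# etc/catalog/mongodb.properties -- one catalog per MongoDB cluster
connector.name=mongodb
mongodb.seeds=172.23.0.7:27017
# leave mongodb.schema-collection at its default, _schema

A document in the _schema collection of, say, the stage database then describes one collection:

{
  "table": "some_collection",
  "fields": [
    { "name": "_id",  "type": "ObjectId", "hidden": true  },
    { "name": "name", "type": "varchar",  "hidden": false }
  ]
}

Each MongoDB database shows up as a schema of that catalog, so the queries from the question look like:

SHOW COLUMNS FROM mongodb.stage.some_collection;
SELECT count(*) FROM mongodb.stage.some_collection;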

Related

Shopware 6 partitioning

Has anyone had any experience with database partitioning? We already have a lot of data, and queries on it are starting to slow down. Maybe someone has some examples? These are tables related to orders.
Shopware, since version 6.4.12.0, allows the use of database clusters; see the relevant documentation. You will have to set up a number of read-only nodes first. The load of reading data will then be distributed among the read-only nodes, while write operations are restricted to the primary node.
Note that in a cluster setup you should also use a lock storage that complements the setup.
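A minimal sketch of what such a setup can look like, assuming the DATABASE_REPLICA_n_URL environment variables from the cluster documentation and a Redis-backed lock storage (all hostnames and credentials are placeholders):

# .env -- the primary takes all writes, the replicas serve reads
DATABASE_URL="mysql://shopware:secret@db-primary:3306/shopware"
DATABASE_REPLICA_0_URL="mysql://shopware:secret@db-replica-0:3306/shopware"
DATABASE_REPLICA_1_URL="mysql://shopware:secret@db-replica-1:3306/shopware"

# config/packages/framework.yaml -- a shared lock storage for all app servers
framework:
    lock: 'redis://redis-host:6379'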
Besides using a DB cluster, you can also try to reduce the load on the DB server.
The first thing you should do is enable the HTTP-Cache; better still, additionally set up a reverse proxy cache like Varnish. This will greatly decrease the number of requests that hit your web server, and thus your DB server as well.
Besides that, all the measures explained here should improve the overall performance of your shop as well as decrease the load on the DB.
Additionally, you could use Elasticsearch so that costly search requests won't hit the database, use a "real" message queue so that messages are not stored in the database, and use Redis instead of the database for storing performance-critical information, as documented in the articles in this category of the official docs.
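As a rough sketch of those last points, again with placeholder values (the SHOPWARE_ES_* variables follow the Elasticsearch setup guide, and the message queue line assumes the standard Symfony Messenger transport configuration):

# .env
SHOPWARE_ES_ENABLED=1
SHOPWARE_ES_INDEXING_ENABLED=1
SHOPWARE_ES_HOSTS=elasticsearch:9200
# move the message queue off the database, e.g. to RabbitMQ
MESSENGER_TRANSPORT_DSN=amqp://guest:guest@rabbitmq:5672/%2f/messages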
The impact of all those measures probably depends on your concrete project setup, so maybe you'll see something in the DB locks that hints at one of the points I mentioned previously and indicates where to start. E.g. if you see a lot of search-related queries, Elasticsearch would be a great start; but if you see a lot of DB load coming from writing/reading/deleting messages, then the message queue might be the better starting point.
All in all, when you use a DB cluster with a primary and multiple replicas and add the additional services I mentioned here, your shop should be able to scale quite well without the need to partition the actual DB.

How do you perform queries without specifying the shard key in the MongoDB API, and how do you query across partitions?

In the SQL API, the latter is enabled by setting EnableCrossPartitionQuery to true on the request, but I'm not able to find anything like that for the MongoDB API. And my queries that work on an unsharded collection now fail (queries that specify the shard key work as expected).
The queries fail regardless of whether I use the AsQueryable extension syntax or the aggregation framework.
As far as I know, there is no property similar to EnableCrossPartitionQuery in the Cosmos DB MongoDB API. In fact, Cosmos DB is an independent server implementation that does not directly align with MongoDB server versions and features.
Cosmos DB supports a subset of the MongoDB API and translates requests into the Cosmos DB SQL equivalent. It has some different behaviours and results, particularly in its implementation of partitioning as compared to MongoDB's sharding, but the onus is on Cosmos DB to improve its emulation of MongoDB.
Certainly, you could add feedback here to get official assistance or consider using MongoDB Atlas on Azure if you'd like full MongoDB feature support.
Hope it helps you.
This was confirmed as a bug by the Product Group team! It will be fixed in the first two weeks of September, in case anyone runs into the same problem in the meantime.

ArangoDB: Accessing a collection of database A from database B

I am trying to access a collection of one database from a different database. Is that possible in ArangoDB?
Regards,
Sajeev
No, it's not possible.
ArangoDB strictly separates databases, so AQL doesn't know about databases in the first place; it operates below the database layer.
As we also discussed in the GitHub issue, we don't currently intend to implement support for this.
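To illustrate the separation: each database needs its own connection, and anything that combines data from both has to happen in your application. A small sketch using the arangojs driver (the driver choice, URLs, and collection names are assumptions, not from this thread):

// one connection handle per database
import { Database, aql } from "arangojs";

const dbA = new Database({ url: "http://localhost:8529", databaseName: "databaseA" });
const dbB = new Database({ url: "http://localhost:8529", databaseName: "databaseB" });

async function main() {
  // AQL always runs inside a single database, so each query targets one handle
  const usersCursor = await dbA.query(aql`FOR u IN users RETURN u`);
  const ordersCursor = await dbB.query(aql`FOR o IN orders RETURN o`);

  const users = await usersCursor.all();
  const orders = await ordersCursor.all();

  // any "join" across the two databases has to be done client-side
  console.log(users.length, orders.length);
}

main().catch(console.error);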

Where to use Neo4j

I'm actually trying to learn new things...
I used SQL for a long time, using MySQL and recently discovered document-oriented databases.
I came across graph databases & Neo4j and want to try it from Node.js, but I don't really get the point.
Should I use Neo4j coupled with another DB? Like storing my data into MySQL & relationships in Neo4j?
Or may I use Neo4j to store data (like posts)?
Neo4j is often used as the primary database; see https://github.com/thingdom/node-neo4j for a Node.js driver. Also, depending on your use case, you can combine it with MySQL, handing off the complex queries that take a long time in MySQL, such as recommendations and other path queries. See http://docs.neo4j.org/chunked/snapshot/data-modeling-examples.html for some interesting starting examples.
/peter
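A minimal Node.js sketch of using Neo4j as the primary store for posts and their relationships, here using the official neo4j-driver package rather than the thingdom driver linked above; the URI, credentials, and data model are placeholder assumptions:

import neo4j from "neo4j-driver";

const driver = neo4j.driver(
  "bolt://localhost:7687",
  neo4j.auth.basic("neo4j", "password")
);

async function createPost(authorName: string, title: string) {
  const session = driver.session();
  try {
    // store the post itself and its relationship to the author in the same graph
    const result = await session.run(
      `MERGE (u:User {name: $authorName})
       CREATE (p:Post {title: $title, createdAt: datetime()})
       CREATE (u)-[:WROTE]->(p)
       RETURN p.title AS title`,
      { authorName, title }
    );
    return result.records[0].get("title");
  } finally {
    await session.close();
  }
}

createPost("alice", "Hello graph world")
  .then((title) => console.log("created:", title))
  .finally(() => driver.close());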

SubSonic-based app that connects to multiple databases

I have developed an app that connects to a SQL Server 2005 database, so my DAL objects were generated using information from that DB.
It will also need to be able to connect to Oracle and MySQL DBs, all with the same table structures (aside from the normal differences in field types, such as varbinary(max) in SQL Server and BLOB in Oracle, and so on). For this purpose, I have already defined multiple connection strings and multiple SubSonic providers for the different DBs the app will run on.
My question is: if I generated my objects using a SQL Server database, should the generated objects work transparently with the other DBs, or do I need to generate a different DAL for each database engine I use? Should I be aware of any bugs I may encounter while doing this?
Thanks in advance for any advice on this issue.
I'm using SubSonic 2.2 by the way....
From what I've been able to test so far, I can't see an easy way to achieve what I'm trying to do.
The ideal situation for me would have been to generate the SubSonic objects using SQL Server, for example, and then be able to switch dynamically to MySQL just by creating the correct provider for it at runtime, along with its connection string. I got to a point where my app would correctly switch from a SQL Server DB to a MySQL DB, but then the app fails, since SubSonic internally generates queries of the form
SELECT * FROM dbo.MyTable
which MySQL obviously doesn't support. I also noticed queries that enclosed table names in brackets ([]), so it seems there are a number of factors that limit the use of a single provider across multiple DB engines.
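For comparison, the MySQL-flavoured form of that statement would need to drop the dbo. schema prefix and use backticks rather than brackets for quoting identifiers, e.g.:

SELECT * FROM `MyTable`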
I guess my only other option is to sort it out with multiple generated providers, although I must admit it doesn't make me comfortable knowing that I'll have N copies of basically the same classes across my project.
I would really love to hear from anyone else who has had similar experiences. I'll be sure to post my results once I get everything sorted out and working for my project.
Has any of this changed in 3.0? This would definitely be a worthy reason for me to upgrade if life is any easier on this matter...
