not in query and select one field from second collection - node.js

My requirement is to count all the data whose particular id is not in reference collection. The equivalent SQL query would go as below:
select count(*) from tbl1 where tbl1.arr.id not in (select id from tbl2)
I've tried the query below, but got stuck on fetching a single field (id) from the second query.
db.coll1.find(
{$not:
{"arr.id":
{$in:
{db.coll2.find()}//how would I fetch a single column from
//2nd coll2
}
}
}
).count()
Also, please note that arr.id is an ObjectId stored in collection coll1, and the same goes for id in collection coll2. Should special care be taken while fetching the id, say by wrapping it as ObjectId(id)?
Update - I am using MongoDB version 3.0.9

I had to use $nin for the not-in condition and fetch the array in a different format, as the MongoDB version was 3.0.9. Below is how I did it.
db.coll1.find({"arr.id":{$nin:[db.coll2.find({},["id"])]}}).count()
For MongoDB >= 3.2 it would be as below:
db.coll1.find({"arr.id":{$nin:[db.coll2.find({},"id")]}}).count()
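An alternative sketch, in case the nested find() does not give the expected count: $nin expects a plain array of values, so one option is to materialise the ids from coll2 with distinct() first and pass that array in. The collection and field names are the ones from the question; the helper name countUnreferenced is made up for illustration, and db is assumed to expose mongo-shell-style distinct() and count(). Since distinct() returns the values as stored, ObjectIds should not need any extra wrapping.

```javascript
// Hypothetical helper: `db` is assumed to expose mongo-shell-style collections.
function countUnreferenced(db) {
  // distinct() returns a plain array of the stored values (ObjectIds stay ObjectIds).
  var referencedIds = db.coll2.distinct("id");
  // $nin takes an array of values, not a cursor.
  return db.coll1.count({ "arr.id": { $nin: referencedIds } });
}
```

In the shell this would be called as countUnreferenced(db). For a very large coll2 the id array can get big, so this is only a sketch for moderate sizes.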

Related

Deleting several rows in table of DB

I am using code-generator. I have two tables. In one table I have user_id and several docs for this user_id. In the second table I have docs for this user_id, but without the user_id itself, and I have to delete these docs.
Please help!
I'm assuming you have some criteria about the user whose documents you want to delete, just not the foreign key? Just use a semi-join, which you can use in DELETE statements as well.
Assuming your search criteria is something like the username:
In SQL
DELETE FROM docs
WHERE docs.user_id IN (
SELECT user_id
FROM user
WHERE username = ?
)
In Java
// Assuming the usual static import:
import static org.jooq.impl.DSL.*;
// Then write:
ctx.deleteFrom(DOCS)
.where(DOCS.USER_ID.in(
select(USER.USER_ID)
.from(USER)
.where(USER.USERNAME.eq(username))
))
.execute();

Fetch jsonb column of postgres db

I am using node-postgres to select and insert data into postgres. I have a column of jsonb type which I am fetching from the db using the query below:
getEmployee() {
return "SELECT empId, empData FROM employee WHERE empId = $1";
}
where empData is a jsonb column. Below is the code snippet which uses the above query.
const employee = await DBService.query(pgObj.getEmployee(), [empId]);
When I try to get empData from employee, I get an empty value.
const { empData } = employee;
I am not sure what I am missing here. Is this the correct way to fetch a jsonb column from a Postgres db in Node.js?
Are you sure empdata is even populated, in the database? Maybe it's empty.
Also, what are the jsonb fields of empdata?
To get the actual sub-fields of empdata, you need the ->> operator, e.g.:
Get the whole JSON object as text:
SELECT empId, empData::text
FROM employee WHERE empId = $1;
Get individual attributes:
SELECT empId, empData->>'annual_pay' AS salary
FROM employee WHERE empId = $1;
etc...
You can also have a look here: https://kb.objectrocket.com/postgresql/how-to-query-a-postgres-jsonb-column-1433
I haven't tried these out, I'm not in front of postgres right now.
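On the Node side, a possible cause worth checking (a sketch, assuming DBService.query hands back the node-postgres result object unchanged, which the question does not confirm): node-postgres resolves with an object whose rows live under result.rows, and PostgreSQL folds unquoted column names to lowercase, so the key would be empdata rather than empData. The helper name extractEmpData is hypothetical.

```javascript
// Hypothetical helper: `result` is assumed to be the object node-postgres
// resolves with, i.e. { rows: [...] }.
function extractEmpData(result) {
  const row = result.rows[0];
  if (!row) return undefined; // no employee matched
  // Unquoted identifiers are folded to lowercase by PostgreSQL, so the key is
  // "empdata" unless the query aliases it with a quoted name such as "empData".
  return row.empData !== undefined ? row.empData : row.empdata;
}
```

With the query from the question this would be used as extractEmpData(employee). Aliasing the column in SQL (empData AS "empData") is another way to keep the camel-case key.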

Query to get all Cosmos DB documents referenced by another

Assume I have the following Cosmos DB container with the possible doc type partitions:
{
"id": <string>,
"partitionKey": <string>, // Always "item"
"name": <string>
}
{
"id": <string>,
"partitionKey": <string>, // Always "group"
"items": <array[string]> // Always an array of ids for items in the "item" partition
}
I have the id of a "group" document, but I do not have the document itself. What I would like to do is perform a query which gives me all "item" documents referenced by the "group" document.
I know I can perform two queries: 1) Retrieve the "group" document, 2) Perform a query with IN clause on the "item" partition.
As I don't care about the "group" document other than getting the list of ids, is it possible to construct a single query to get me all the "item" documents I want with just the "group" document id?
You'll need to perform two queries, as there are no joins between separate documents. Even though there is support for subqueries, only correlated subqueries are currently supported (meaning, the inner subquery is referencing values from the outer query). Non-correlated subqueries are what you'd need.
Note that, even though you don't want all of the group document, you don't need to retrieve the entire document. You can project just the items property, which can then be used in your 2nd query, with something like array_contains(). Something like:
SELECT VALUE g.items
FROM g
WHERE g.id="1"
AND g.partitionKey="group"
SELECT VALUE i.name
FROM i
WHERE array_contains(<items-from-prior-query>,i.id)
AND i.partitionKey="item"
This documentation page clarifies the two subquery types and support for only correlated subqueries.
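The two-step approach above could be sketched in JavaScript roughly like this. It assumes container is an @azure/cosmos Container (using its items.query(...).fetchAll() call); the helper name getItemsForGroup is made up, and the second query selects whole item documents instead of just i.name, since that is what the question asks for.

```javascript
// Hypothetical helper; `container` is assumed to be an @azure/cosmos Container.
async function getItemsForGroup(container, groupId) {
  // Step 1: project only the `items` array of the group document.
  const groupQuery = {
    query: 'SELECT VALUE g.items FROM g WHERE g.id = @id AND g.partitionKey = "group"',
    parameters: [{ name: "@id", value: groupId }],
  };
  const { resources: groups } = await container.items.query(groupQuery).fetchAll();
  const itemIds = groups[0] || [];
  if (itemIds.length === 0) return []; // unknown group or empty list

  // Step 2: fetch the referenced documents from the "item" partition.
  const itemQuery = {
    query: 'SELECT * FROM i WHERE ARRAY_CONTAINS(@ids, i.id) AND i.partitionKey = "item"',
    parameters: [{ name: "@ids", value: itemIds }],
  };
  const { resources: items } = await container.items.query(itemQuery).fetchAll();
  return items;
}
```

Passing the ids as a query parameter avoids building the array literal into the query string by hand.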

Azure CosmosDB: how to ORDER BY id?

Using a vanilla CosmosDB collection (all default), adding documents like this:
{
"id": "3",
"name": "Hannah"
}
I would like to retrieve records ordered by id, like this:
SELECT c.id FROM c
ORDER BY c.id
This gives me the error: Order-by item requires a range index to be defined on the corresponding index path.
I expect this is because /id is hash indexed and not range indexed. I've tried to change the Indexing Policy in various ways, but any change I make which would touch / or /id gets wiped when I save.
How can I retrieve documents ordered by ID?
The best way to do this is to store a duplicate property e.g. id2 that has the same value of id, and is indexed using a range index, then use that for sorting, i.e. query for SELECT * FROM c ORDER BY c.id2.
PS: The reason this is not supported is that id is part of a composite index (which is on partition key and row key; id is the row key part). The Cosmos DB team is working on a change that will allow sorting by id.
EDIT: new collections now support ORDER BY c.id as of 7/12/19
I found this page, CosmosDB Indexing Policies, which has the below note that may be helpful:
Azure Cosmos DB returns an error when a query uses ORDER BY but
doesn't have a Range index against the queried path with the maximum
precision.
Some other information from elsewhere in the document:
Range supports efficient equality queries, range queries (using >, <, >=, <=, !=), and ORDER BY queries. ORDER BY queries by default also require maximum index precision (-1). The data type can be String or Number.
Some guidance on the types of queries served by Range indexes:
Range: Range over /prop/? (or /) can be used to serve the following queries efficiently:
SELECT FROM collection c WHERE c.prop = "value"
SELECT FROM collection c WHERE c.prop > 5
SELECT FROM collection c ORDER BY c.prop
And a code example from the docs also:
var rangeDefault = new DocumentCollection { Id = "rangeCollection" };
// Override the default policy for strings to Range indexing and "max" (-1) precision
rangeDefault.IndexingPolicy = new IndexingPolicy(new RangeIndex(DataType.String) { Precision = -1 });
await client.CreateDocumentCollectionAsync(UriFactory.CreateDatabaseUri("db"), rangeDefault);
Hope this helps,
J

Index multiple MongoDB fields, make only one unique

I've got a MongoDB database of metadata for about 300,000 photos. Each has a native ID that needs to be unique to protect against duplicate insertions. It also has a time stamp.
I frequently need to run aggregate queries to see how many photos I have for each day, so I also have a date field in the format YYYY-MM-DD. This is obviously not unique.
Right now I only have an index on the id property, like so (using the Node driver):
collection.ensureIndex(
{ id:1 },
{ unique:true, dropDups: true },
function(err, indexName) { /* etc etc */ }
);
The group query for getting the photos by date takes quite a long time, as one can imagine:
collection.group(
{ date: 1 },
{},
{ count: 0 },
function ( curr, result ) {
result.count++;
},
function(err, grouped) { /* etc etc */ }
);
I've read through the indexing strategy, and I think I need to also index the date property. But I don't want to make it unique, of course (though I suppose it's fine to make it unique in combination with the unique id). Should I do a regular compound index, or can I chain the .ensureIndex() function and only specify uniqueness for the id field?
MongoDB does not have "mixed" type indexes which can be partially unique. On the other hand, why don't you use _id instead of your id field, if possible? It's already indexed and unique by definition, so it will prevent you from inserting duplicates.
Mongo can only use a single index in a query clause - important to consider when creating indexes. For this particular query and requirements I would suggest to have a separate unique index on id field which you would get if you use _id. Additionally, you can create a non-unique index on date field only. If you run query like this:
db.collection.find({"date": "01/02/2013"}).count();
Mongo will be able to use index only to answer the query (covered index query) which is the best performance you can get.
Note that Mongo won't be able to use a compound index on (id, date) if you are searching by date only. Your query has to match the index prefix first, i.e. if you search by id then the (id, date) index can be used.
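A minimal sketch of the two separate indexes described above, assuming a Node driver version whose createIndex() returns a promise when no callback is passed (the question uses the older callback-style ensureIndex). The helper name ensurePhotoIndexes is hypothetical.

```javascript
// Hypothetical helper: `collection` is assumed to be a Node driver Collection.
async function ensurePhotoIndexes(collection) {
  // Unique index that guards against duplicate photo ids.
  await collection.createIndex({ id: 1 }, { unique: true });
  // Plain (non-unique) index so queries by date alone can use an index.
  await collection.createIndex({ date: 1 });
}
```

If _id is used in place of id, the first call is unnecessary because _id is always uniquely indexed.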
Another option is to pre-aggregate in the schema itself: keep a counter per date and increment it whenever you insert a photo. This way you don't need to run any aggregation jobs. You can also run some tests to determine if this approach is more performant than aggregation.
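The pre-aggregation idea could look roughly like this. It is only a sketch: the dailyCounts collection, its { _id: date, count } shape, and the helper name insertPhoto are all assumptions, and the two writes are not atomic together.

```javascript
// Hypothetical helper: `photos` and `dailyCounts` are assumed to be
// Node driver Collections; `dailyCounts` holds one document per date.
async function insertPhoto(photos, dailyCounts, photo) {
  // If this throws (e.g. duplicate key on the unique id), the counter is not touched.
  await photos.insertOne(photo);
  // Create the per-date counter on first use, otherwise bump it by one.
  await dailyCounts.updateOne(
    { _id: photo.date },
    { $inc: { count: 1 } },
    { upsert: true }
  );
}
```

Reading the per-day totals is then a plain find() on dailyCounts instead of a group over all photos.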
