Searching parent id in cloudant - couchdb

I have a Cloudant DB with the following structure:
{id: 1, resource: "john doe", manager: "john smith", amount: 13}
{id: 2, resource: "mary doe", manager: "john smith", amount: 3}
{id: 3, resource: "john smith", manager: "peter doe", amount: 10}
I needed a query to return the sum of amount, so I've built a query with emit(doc.manager, doc.amount) which returns
{"rows":[
{"key":"john smith","value":16},
{"key":"peter doe","value":10}]}
It is working like a charm. However I need the manager ID along with Manager name. The result I am looking for is:
{"rows":[
{"key":{"john smith",3},"value":16},
{"key":{"peter doe",null},"value":10}]}
How should I build a map view to search the parent ID?
Thanks,
Erik

Unfortunately I don't think there's a way to do exactly what you want in one query. Assuming you have the following three documents in your database:
{
"_id": "1",
"resource": "john doe",
"manager": "john smith",
"amount": 13
}
--
{
"_id": "2",
"resource": "mary doe",
"manager": "john smith",
"amount": 3
}
--
{
"_id": "3",
"resource": "john smith",
"manager": "peter doe",
"amount": 10
}
The closest thing to what you want would be the following map function (which uses a compound key) and a _sum reduce:
function(doc) {
  emit([doc.manager, doc._id], doc.amount);
}
This would give you the following results with reduce=false:
{"total_rows":3,"offset":0,"rows":[
{"id":"1","key":["john smith","1"],"value":13},
{"id":"2","key":["john smith","2"],"value":3},
{"id":"3","key":["peter doe","3"],"value":10}
]}
With reduce=true and group_level=1, you essentially get the same results as what you already have:
{"rows":[
{"key":["john smith"],"value":16},
{"key":["peter doe"],"value":10}
]}
If you instead do reduce=true and group=true (exact grouping) then you get the following results:
{"rows":[
{"key":["john smith","1"],"value":13},
{"key":["john smith","2"],"value":3},
{"key":["peter doe","3"],"value":10}
]}
Each unique combination of the manager and _id fields is summed separately, which unfortunately doesn't give you what you want. To get the result you are after, I think your best bet would be to sum up the values on the client after querying the database.
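As a rough illustration of that client-side step, here is a minimal sketch. It assumes the three documents above have already been fetched into an array (for example via _all_docs with include_docs=true); the function name sumByManager is made up for this example, and a manager's ID is found by matching the manager name against the resource field of another document.

```javascript
// Hypothetical helper, not part of Cloudant or CouchDB.
// docs: array of documents shaped like the three examples above.
function sumByManager(docs) {
  // Map each resource name to its document _id so that a manager's
  // own ID can be looked up by name.
  var idByResource = {};
  docs.forEach(function (doc) {
    idByResource[doc.resource] = doc._id;
  });

  var totals = {};
  docs.forEach(function (doc) {
    if (!Object.prototype.hasOwnProperty.call(totals, doc.manager)) {
      var known = Object.prototype.hasOwnProperty.call(idByResource, doc.manager);
      totals[doc.manager] = {
        // null when the manager has no document of their own
        key: [doc.manager, known ? idByResource[doc.manager] : null],
        value: 0
      };
    }
    totals[doc.manager].value += doc.amount;
  });

  // Return rows in the same shape as a view response.
  return Object.keys(totals).map(function (manager) {
    return totals[manager];
  });
}
```

This relies on resource names being unique; if two people share a name, the lookup by name becomes ambiguous and the documents would need to store the manager's _id instead.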

Related

I can't query over populated children attributes

I am trying to query over populated children attributes using mongoose, but it doesn't work and always returns an empty array.
Even hardcoding correct, existing values into the query returns an empty array.
My schema is a business schema with a one-to-one relationship to a user schema via the attribute createdBy. The user schema has an attribute name, which I am trying to query on.
So if I make a query like this:
business.find({'createdBy.name': {$regex:"steve"}}).populate('createdBy')
The above never returns any documents, although without the find condition everything works fine.
Can I search by the name inside a populated child or not? All the tutorials say this should work, but it just doesn't.
EDIT : an example of what the record looks like :
{
"_id": "5fddedd00e8a7e069085964f",
"status": 6,
"addInfo": "",
"descProduit": "",
"createdBy": {
"_id": "5f99b1bea9ba194dec3bd6aa",
"status": 1,
"fcmtokens": [
],
"emailVerified": 1,
"phoneVerified": 0,
"userType": "User",
"name": "steve buschemi",
"firstName": "steve",
"lastName": "buschemi",
"tel": "",
"email": "steve#buschemi.com",
"register_token": "747f1e1e8fa1ecd2f1797bb402563198",
"createdAt": "2020-10-28T18:00:30.814Z",
"updatedAt": "2020-12-18T13:52:07.430Z",
"__v": 19,
"business": "5f99b1e101bfff39a8259457",
"credit": 635,
},
"createdAt": "2020-12-19T12:10:57.703Z",
"updatedAt": "2020-12-19T12:11:16.538Z",
"__v": 0,
"nid": "187"
}
It seems there is no way to filter parent documents by conditions on child documents:
From the official documentation:
In general, there is no way to make populate() filter stories based on properties of the story's author. For example, the below query won't return any results, even though author is populated.
const story = await Story.
findOne({ 'author.name': 'Ian Fleming' }).
populate('author').
exec();
story; // null
If you want to filter stories by their author's name, you should use denormalization.
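To make the denormalization idea concrete, here is a minimal in-memory sketch using plain objects rather than mongoose itself. The field name createdByName and both function names are assumptions made up for this example: the creator's name is copied onto the business record when it is written, so the filter can run on the business record alone.

```javascript
// Hypothetical helper: copy the creator's name onto the business record
// at write time, next to the reference that populate() would resolve.
function denormalizeCreatorName(business, user) {
  return Object.assign({}, business, {
    createdBy: user._id,
    createdByName: user.name
  });
}

// Filtering now only needs a field stored on the business record itself.
function findByCreatorName(businesses, pattern) {
  return businesses.filter(function (business) {
    return pattern.test(business.createdByName);
  });
}
```

With such a field stored on the business documents, the condition in find() would target createdByName instead of 'createdBy.name'. The trade-off is that the copied name has to be updated whenever the user's name changes.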

SequelizeJS primaryKey related include

I am currently working on a REST API, working with SequelizeJS and Express.
I'm used to Django Rest Framework and I'm trying to find a similar function :
I have a table User and a table PhoneNumber.
I want to be able to return a user in JSON, including the list of the primarykeys of its phone numbers like this :
{
"firstName": "John",
"lastName": "Doe",
"phoneNumbers": [23, 34, 54],
}
Is there a way to do this simply and efficiently in sequelize or do I have to write functions that transform the fields like :
"phoneNumbers": [
{ "id": 23, "number": "XXXXXXXXXX" },
{ "id": 34, "number": "XXXXXXXXXX" },
{ "id": 54, "number": "XXXXXXXXXX" }
]
into what I have above ?
Thank you,
Giltho
Sequelize finder methods accept an attributes option that lets you define which properties of a model should be queried. See http://docs.sequelizejs.com/manual/tutorial/querying.html#attributes
That works for joins too:
User.all({
  include: [
    { model: Phonenumber, attributes: ['id'] }
  ]
})
.then(function(users) {
})
will execute a query roughly like
SELECT user.*, phonenumber.id FROM user LEFT OUTER JOIN phonenumber ON ...
But to turn the phonenumbers into an array of integers [1,2,...] in your api response, you'd still have to map the ids manually into an array, otherwise you get an array of [{"id": 1}, {"id": 2},...].
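That manual mapping step could look like the sketch below. It assumes the user has already been converted to a plain object and that the included records sit under a phoneNumbers property; the actual property name depends on how the association is defined, and withPhoneNumberIds is a made-up helper name.

```javascript
// Hypothetical helper: replace the included phone number objects
// with just their primary keys.
function withPhoneNumberIds(user) {
  var result = Object.assign({}, user);
  result.phoneNumbers = (user.phoneNumbers || []).map(function (phoneNumber) {
    return phoneNumber.id;
  });
  return result;
}
```

The input object is left untouched, so the full phone number objects remain available elsewhere in the request handler.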
That said, I would recommend against doing that. An array of objects is more future-proof than an array of integers, because your API clients won't need to change anything if you decide some day to extend the phone number object with additional attributes.

Return distinct and sorted query in AQL

So I have two collections, one with cities with an array of postal codes as a property and one with postal codes and their latitude & longitude.
I want to return the cities closest to a coordinate. This is easy enough with a geo index, but the issue I'm having is that the same city is returned multiple times, and sometimes it can be both the 1st and the 3rd closest because the postal code I'm searching in borders another city.
cities example data:
[
{
"_key": "30936019",
"_id": "cities/30936019",
"_rev": "30936019",
"countryCode": "US",
"label": "Colorado Springs, CO",
"name": "Colorado Springs",
"postalCodes": [
"80904",
"80927"
],
"region": "CO"
},
{
"_key": "30983621",
"_id": "cities/30983621",
"_rev": "30983621",
"countryCode": "US",
"label": "Manitou Springs, CO",
"name": "Manitou Springs",
"postalCodes": [
"80829"
],
"region": "CO"
}
]
postalCodes example data:
[
{
"_key": "32132856",
"_id": "postalCodes/32132856",
"_rev": "32132856",
"countryCode": "US",
"location": [
38.9286,
-104.6583
],
"postalCode": "80927"
},
{
"_key": "32147422",
"_id": "postalCodes/32147422",
"_rev": "32147422",
"countryCode": "US",
"location": [
38.8533,
-104.8595
],
"postalCode": "80904"
},
{
"_key": "32172144",
"_id": "postalCodes/32172144",
"_rev": "32172144",
"countryCode": "US",
"location": [
38.855,
-104.9058
],
"postalCode": "80829"
}
]
The following query works but as an ArangoDB newbie I'm wondering if there's a more efficient way to do this:
FOR p IN WITHIN(postalCodes, 38.8609, -104.8734, 30000, 'distance')
  FOR c IN cities
    FILTER p.postalCode IN c.postalCodes AND c.countryCode == p.countryCode
    COLLECT close = c._id AGGREGATE distance = MIN(p.distance)
    FOR c2 IN cities
      FILTER c2._id == close
      SORT distance
      RETURN c2
The first FOR in the query will use the geo index and probably return few documents (just the postal codes around the specified location).
The second FOR will look up the city for each postal code found. This may be an issue, depending on whether there is an index present on cities.postalCodes and cities.countryCode. If not, then the second FOR has to do a full scan of the cities collection each time it is invoked, which is inefficient. It may therefore be worthwhile to create an index on the two attributes like this:
db.cities.ensureIndex({ type: "hash", fields: ["countryCode", "postalCodes[*]"] });
The third FOR can be removed entirely when not COLLECTing by c._id but by c:
FOR p IN WITHIN(postalCodes, 38.8609, -104.8734, 30000, 'distance')
  FOR c IN cities
    FILTER p.postalCode IN c.postalCodes AND c.countryCode == p.countryCode
    COLLECT city = c AGGREGATE distance = MIN(p.distance)
    SORT distance
    RETURN city
This will shorten the query string, but I don't think it will help efficiency much, as the third FOR uses the primary index to look up the city documents, which is O(1).
In general, when in doubt about a query using indexes, you can use db._explain(queryString) to show which indexes will be used by a query.

Nodejs check gender by name

I need to determine gender from a name. I have a list of names in their base form, such as "Peter" and "Anna". That part is not very complicated, but the application must also return a gender probability when the name is not in its base form; for example, "Peter" and "Petka" should be treated as equal. Does anybody know a good solution for Node.js?
It is likely to fail quite often and, as pointed out in the comments, it may even offend some people. That being said, there is an API that does exactly that: Genderize.io
It returns results like : {"name":"peter","gender":"male","probability":"0.99","count":796} You can also localize your query for more accuracy.
And their db is 177k names large, so it's probably your best bet.
EDIT :
To take the example you mention, here's what it returns for 'Petka' :
{
name: "petka",
gender: "female",
probability: "1.00",
count: 2
}
So I guess there's room for improvement.
Or you can use name2gender.com, which offers 10,000 free API calls per month.
Response for 'peter':
{
"name" : "peter",
"gender" : "MALE",
"accuracy" : 98.53,
"samples" : 253705,
"country" : "WORLD",
"durationMs" : 0
}
More samples means a more precise result.
You could also use https://veriocheck.com
They have an API for name-to-gender lookup, but they also provide name corrections along with it, which we find more useful. If the name is misspelled or incorrect, they provide a correction and then look up the gender for the corrected name.
It's better to use a professional API service like parser.name for this. You can post a name like Peter or Anna and you'll get back the gender of the names within milliseconds.
firstname: {
name: "Anna",
name_ascii: "Anna",
validated: true,
gender: "f",
gender_formatted: "female",
unisex: false,
gender_deviation: 0,
country_code: "US",
country_certainty: 31,
country_rank: 28,
alternative_countries: {
GB: 13,
PL: 8,
SE: 6
}
}
You can also check https://genderapi.io
https://genderapi.io/api?name=peter;anna
{
"status": true,
"duration": "56ms",
"used_credits": 2,
"q": "peter;anna",
"names": [
{
"name": "peter",
"q": "peter",
"gender": "male",
"total_names": 4787,
"probability": 100
},
{
"name": "anna",
"q": "anna",
"gender": "female",
"total_names": 9609,
"probability": 100
}
]
}

How to find documents which match a subset of keys

Let's say we have the following data structure:
{
"name": "",
"tags": []
}
With the following example data:
{
"name": "Test1",
"tags": [
"Laptop",
"Smartphone",
"Tablet"
]
}
{
"name": "Test2",
"tags": [
"Computer",
"Laptop",
"Smartphone",
"Tablet"
]
}
{
"name": "Test3",
"tags": [
"Smartphone",
"Tablet"
]
}
Now I am trying to find:
Find all documents with Smartphone AND Tablet in tags. This should return all documents.
I can't figure out how this works with couchdb. I tried to add the tags as keys and played with startkey / endkey with no luck.
Hope somebody can help me out.
Greetings,
Ben
I see two possible solutions. Why don't you have a view with a map function like:
function(doc) {
  doc.tags && doc.tags.forEach(function(tag) {
    emit(tag, null);
  });
}
Now if you query this view with keys=["Smartphone", "Tablet"] you will get the following lines:
{id: "A", key: "Smartphone", value: null},
{id: "B", key: "Smartphone", value: null},
{id: "C", key: "Smartphone", value: null},
{id: "A", key: "Tablet", value: null},
{id: "B", key: "Tablet", value: null},
{id: "C", key: "Tablet", value: null}
Now you have to parse this response well on the client side, to filter out the ids which don't show up for all the keys of your query. In this case all the documents (A, B, C) show up, so this is your result. Once you have it, you can use bulk get to fetch the values with:
POST http://...:../db/_all_docs?include_docs=true
with keys=["A", "B", "C"]
This is how I'd do it.
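The client-side filtering step of this first approach can be sketched as follows. It assumes rows is the rows array from the view response and keys is the list of tags that was queried; idsMatchingAllKeys is a made-up name for this example.

```javascript
// Hypothetical helper: keep only the document ids that appear
// in the view rows for every one of the requested keys.
function idsMatchingAllKeys(rows, keys) {
  var seen = {}; // id -> { key: true } for each requested key seen
  rows.forEach(function (row) {
    if (keys.indexOf(row.key) === -1) {
      return; // ignore rows for keys that were not requested
    }
    if (!seen[row.id]) {
      seen[row.id] = {};
    }
    seen[row.id][row.key] = true;
  });

  return Object.keys(seen).filter(function (id) {
    return keys.every(function (key) {
      return seen[id][key] === true;
    });
  });
}
```

The returned ids are what would be sent as keys in the follow-up _all_docs request.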
The second approach you could use is to have a map function which emits all possible subsets of doc.tags (see the question "Find all possible subset combos in an array?"). With an index structured like this, you can get the desired documents with a single query, simply using:
key=["Smartphone", "Tablet"] and include_docs=true.
However, keep in mind that this means emitting 2**n rows per document in your view (where n is the number of tags), so use this approach only if you are sure that each document has only a few tags.
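A minimal sketch of the subset generation for this second approach is shown below. The helper name allSubsets is made up for this example; it sorts the tags first so that every emitted key has a predictable order, and it skips the empty subset, so a document with n tags yields 2**n - 1 keys.

```javascript
// Hypothetical helper: return every non-empty subset of tags,
// each one sorted so that keys are comparable across documents.
function allSubsets(tags) {
  var sorted = tags.slice().sort();
  var subsets = [];
  // Each bitmask from 1 to 2^n - 1 selects one non-empty subset.
  for (var mask = 1; mask < (1 << sorted.length); mask++) {
    var subset = [];
    for (var i = 0; i < sorted.length; i++) {
      if (mask & (1 << i)) {
        subset.push(sorted[i]);
      }
    }
    subsets.push(subset);
  }
  return subsets;
}

// In a view, the helper would have to live inside the map function, roughly:
// function (doc) {
//   if (doc.tags) {
//     allSubsets(doc.tags).forEach(function (subset) { emit(subset, null); });
//   }
// }
```

Because the keys are emitted in sorted order, the tags in the query key must be sorted the same way; ["Smartphone", "Tablet"] already is.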
