Storing and querying PostgreSQL database entities with multiple related entities? - node.js

I'm designing a PostgreSQL database that will be queried by a Node API using Sequelize. Currently, I have a table called recipes with columns called ingredients and instructions. For a given recipe, each of those columns is stored as an array of strings like {Tomatoes, Onions}.
That method of storage worked fine for simply fetching and rendering a recipe on the client side. But it wasn't working well for fuzzy search querying because, using Sequelize, all I could do was ingredients: { [Op.contains] : [query] }. So if a user typed tomatoes, there was no way to write a "fuzzy" search query that would return a recipe with the ingredient Tomatoes.
And then I read this in the PostgreSQL documentation:
Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.
Now I'm considering storing ingredients and instructions as separate tables, but I have a couple of questions.
1) As a recipe can have multiple ingredients related to it, should I just use a foreign key for each ingredient and the Sequelize hasMany relationship? That seems correct to me, except that I'm now potentially duplicating common ingredients each time a new recipe is created that uses that ingredient.
2) What would be the best way to write a fuzzy search query so that a user could search the main columns of the recipes table (e.g. title, description) and additionally apply their query to the instructions and ingredients tables?
Essentially I'd like to end up with a fuzzy search query applied to the three tables that looks something like this...
const recipes = await req.context.models.Recipe.findAll({
  where: {
    [Op.or]: [
      { title: { [Op.iLike]: '%' + query + '%' } },
      { description: { [Op.iLike]: '%' + query + '%' } },
      { ingredients: { ingredient: { [Op.iLike]: '%' + query + '%' } } },
      { instructions: { instruction: { [Op.iLike]: '%' + query + '%' } } }
    ]
  }
});
Thanks!

I have done this. I happen to use GraphQL in my Node layer with Sequelize, and I have filter objects that do this type of thing. You'll just need some include statements in your Recipe.findAll, after your initial where clause where you evaluate whether you are searching title, description, or both. I sent my search params in with prefixes I could strip off that told me which Sequelize ops I wanted to use on them, and ran my args through a utility method to create my where clause, but I know there are many ways to skin that cat. I just didn't want to clutter up my resolvers with tons of hardcoded ops and conditional clauses. Your include might look something like this:
include: [{
  model: models.Ingredient,
  as: 'Ingredients',
  // through: { ... }  // join table options specifying keys where necessary, since this is many to many
  where: { /* some conditional code around your search param */ },
}, {
  model: models.Instruction,
  as: 'Instructions',
  where: { /* some conditional code around your search param */ },
}],
There is good documentation on multiple and nested includes in the Sequelize docs, but from what I see above you have a fairly good understanding of what you need to do. To simplify things a bit, I'd start with just searching on your Recipe fields (title, description) before you add the includes, and get that working; then it will be a little clearer how you want to form your where clauses.
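The prefix-stripping utility mentioned at the start of this answer isn't shown, so here is a minimal sketch of one way it could look. The prefix names (like_, eq_), the function name buildWhere and the stand-in FakeOp object are all assumptions made for illustration; with Sequelize you would pass in its real Op export instead.

```javascript
// Hypothetical prefix convention: "like_title" => title ILIKE '%value%', "eq_id" => id = value.
const PREFIXES = {
  like_: (Op, value) => ({ [Op.iLike]: `%${value}%` }),
  eq_: (Op, value) => ({ [Op.eq]: value }),
};

// Strip the known prefix from each argument name and map it to an operator clause.
function buildWhere(Op, args) {
  const where = {};
  for (const [key, value] of Object.entries(args)) {
    const prefix = Object.keys(PREFIXES).find((p) => key.startsWith(p));
    if (!prefix) continue; // ignore args without a known prefix
    where[key.slice(prefix.length)] = PREFIXES[prefix](Op, value);
  }
  return where;
}

// Stand-in for Sequelize's Op so that the sketch runs without a database.
const FakeOp = { iLike: Symbol('iLike'), eq: Symbol('eq') };
const where = buildWhere(FakeOp, { like_title: 'tomato', eq_id: 7, page: 2 });
console.log(Object.keys(where)); // [ 'title', 'id' ]
```

The point of the design is that the resolver only forwards its args, and the mapping from prefix to operator lives in one place.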
Alternatively, you can skip the includes, write associations in your models, call them with getters, and pass the where clauses to those. I do that as well, and again it's well-documented stuff now. Sequelize has really upped their game.
Recipe.associate = function (models) {
  // A many-to-many association through a join table is declared with belongsToMany
  models.Recipe.belongsToMany(models.Ingredient, { as: 'Ingredients', through: 'recipie_ingredient', foreignKey: 'recipie_id' });
};
Now you have a getter for Ingredients, and if you declare belongsToMany targeting back at Recipe in the Ingredient model, you'll have a getter there as well. You can pass your search string to that via a where clause and get all recipes that have a given ingredient or ingredient list. Clear as mud?
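For the combined query in the question, one option is Sequelize's '$association.column$' syntax, which lets a top-level where refer to columns of included models. The sketch below only builds the where object. The aliases Ingredients and Instructions follow this answer, the column names name and text are hypothetical, and Op is passed in (with a stand-in here) so the snippet runs on its own.

```javascript
// Escape LIKE wildcards so user input such as "50%" is matched literally.
function escapeLike(text) {
  return text.replace(/[\\%_]/g, (ch) => '\\' + ch);
}

// Build a where clause that searches the recipe itself and its included associations.
// '$Ingredients.name$' and '$Instructions.text$' are hypothetical alias/column names.
function buildRecipeSearchWhere(Op, query) {
  const pattern = { [Op.iLike]: `%${escapeLike(query)}%` };
  return {
    [Op.or]: [
      { title: pattern },
      { description: pattern },
      { '$Ingredients.name$': pattern },
      { '$Instructions.text$': pattern },
    ],
  };
}

// Stand-in for Sequelize's Op so the sketch runs without the library.
const SearchOp = { or: Symbol('or'), iLike: Symbol('iLike') };
const searchWhere = buildRecipeSearchWhere(SearchOp, 'tomato');
console.log(searchWhere[SearchOp.or].length); // 4
```

The result would be passed as where to Recipe.findAll together with the matching include entries. If you also use limit, it is worth checking the Sequelize docs for the subQuery option, since I haven't tested that combination here.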

Related

Search string value inside an array of objects inside an object of the jsonb column- TypeORM and Nest.js

The problem I am facing is as follows:
Search value: 'cooking'
JSON object:
data: {
  skills: {
    items: [ { name: 'cooking' }, ... ]
  }
}
Expected result: Should find all the "skill items" that contain 'cooking' inside their name, using TypeORM and Nest.js.
The current code does not support search on the backend, and I should implement this. I want to use TypeORM features, rather than handling it with JavaScript.
Current code: (returns data based on the userId)
const allItems = this.dataRepository.find({ where: [{ user: { id: userId } }] })
I investigated the PostgreSQL documentation regarding the PostgreSQL functions and even though I understand how to create a raw SQL query, I am struggling to convert this to the TypeORM equivalent.
Note: I researched many StackOverflow questions before creating this one, but do inform me if I missed the right one. I will be glad to investigate.
Can you help me figure out the way to query this with TypeORM?
UPDATE
Let's consider the simple raw query:
SELECT *
FROM table1 t
WHERE t.data->'skills' @> '{"items":[{ "name": "cooking"}]}';
This query returns a result for any item within the items array that matches the exact name, in this case "cooking".
That's totally fine, and it can be executed as a raw query, but it is certainly not easy to maintain in the future, nor does it support pattern matching and wildcards (I couldn't find a way to do that; if you know how, please share!). Still, this solution is good enough when you only need exact matches. I'll keep this question updated with new findings.
Use Like in the where clause:
// Like comes from the typeorm package: import { Like } from 'typeorm';
servicePoint = await this.servicePointAddressRepository.find({
  where: [
    { ...isActive, name: Like("%" + key + "%"), serviceExecutive: { id: userId } },
    { ...isActive, servicePointId: Like("%" + key + "%") },
    { ...isActive, branchCode: Like("%" + key + "%") },
  ],
  skip: (page - 1) * limit,
  take: limit,
  order: { updatedAt: "DESC" },
  relations: ["serviceExecutive", "address"]
});
This may help you! I'm matching with key here.
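For wildcard matching inside the items array from the question, one approach in PostgreSQL is to unnest the array with jsonb_array_elements and apply ILIKE to each element's name. The sketch below only builds the SQL fragment and its parameters. The function name skillNameCondition and the alias t are assumptions, and it presumes data is a jsonb column where skills.items is an array.

```javascript
// Build a parameterised condition matching rows whose data.skills.items[*].name
// contains the search text, case-insensitively. "alias" is the table alias used in the query.
function skillNameCondition(alias, search) {
  const sql =
    `EXISTS (SELECT 1 FROM jsonb_array_elements(${alias}.data->'skills'->'items') AS item ` +
    `WHERE item->>'name' ILIKE :pattern)`;
  return { sql, params: { pattern: `%${search}%` } };
}

const cond = skillNameCondition('t', 'cook');
console.log(cond.sql);
console.log(cond.params); // { pattern: '%cook%' }
```

With TypeORM this could be used roughly as this.dataRepository.createQueryBuilder('t').where(cond.sql, cond.params).getMany(), though I haven't run it against a real database.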

Usage of TSVECTOR and to_tsquery to filter records in Sequelize

I've been trying to get full-text search to work for a while now without any success. The current documentation has this example:
[Op.match]: Sequelize.fn('to_tsquery', 'fat & rat') // match text search for strings 'fat' and 'rat' (PG only)
So I've built the following query:
Title.findAll({
  where: {
    keywords: {
      [Op.match]: Sequelize.fn('to_tsquery', 'test')
    }
  }
})
And keywords is defined as a TSVECTOR field.
keywords: {
type: DataTypes.TSVECTOR,
},
It seems like it's generating the query properly, but I'm not getting the expected results. This is the query that is being generated by Sequelize:
Executing (default): SELECT "id" FROM "Tests" AS "Test" WHERE "Test"."keywords" @@ to_tsquery('test');
And I know that there are multiple records in the database that have 'test' in their vector, such as the following one:
{
"id": 3,
"keywords": "'keyword' 'this' 'test' 'is' 'a'",
}
so I'm unsure as to what's going on. What would be the proper way to search for matches based on a TSVECTOR field?
It's funny, but these days I am also working on the same thing and getting the same problem.
I think part of the solution is here (How to implement PostgresQL tsvector for full-text search using Sequelize?), but I haven't been able to get it to work yet.
If you find examples, I'm interested. Otherwise as soon as I find the solution that works 100% I will update this answer.
What I also notice is that when I add data (seeds) from Sequelize, it doesn't add the lexeme positions (the numbers after each lexeme) to the field in question. Do you see the same behavior?
One last thing: did you create the index?
CREATE INDEX tsv_idx ON data USING gin(column);
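One related thing to watch: to_tsquery expects operators between terms (as in the 'fat & rat' example from the documentation), so raw user input containing spaces can't be passed to it directly. Below is a minimal sketch of turning free-form input into such a string. The function name toTsQueryString is hypothetical, and stripping everything except letters and digits is a simplifying assumption.

```javascript
// Turn free-form user input into a to_tsquery-compatible string by keeping
// only letters and digits in each word and AND-ing the words together.
function toTsQueryString(input) {
  return input
    .split(/\s+/)
    .map((word) => word.replace(/[^\p{L}\p{N}]/gu, ''))
    .filter((word) => word.length > 0)
    .join(' & ');
}

console.log(toTsQueryString('fat rat'));    // fat & rat
console.log(toTsQueryString("  test's! ")); // tests
```

The result could then be passed as the second argument to Sequelize.fn('to_tsquery', ...). An empty result means there is nothing to search for, so the caller would probably want to skip the condition in that case.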

Unaccent in Sequelize

I'm currently working on a project that uses ExpressJS, PostgreSQL and Sequelize as the ORM. I developed a search function that queries items by name:
models.foo.findAll({
  where: {
    $or: [
      {name: {$ilike: keywords}},
      {searchMatches: {$contains: [keywords]}}
    ]
  },
  order: [['name', 'ASC']]
})
This works fine, but if the name contains a special character (like á, é, í, ó or ú) this query won't find it.
Is there a way to make the query match names with special characters in a meaningful way? For example, if I search for the name "potato", the results "The potato", "Da potátos" and "We are the pótatóes" should come out, but not "We eat pátatos" (since á != o).
This can now be done without a completely raw query, using Sequelize's built-in functions:
models.foo.findAll({
  where: Sequelize.where(
    Sequelize.fn('unaccent', Sequelize.col('name')),
    { [Op.iLike]: `%${keywords}%` }
  ),
  order: [['name', 'ASC']]
})
Then ordering, associations etc. all work still as normal :).
I finally found a valid solution. First I created the unaccent extension:
create extension unaccent;
Then I just used a raw query (I couldn't figure out how to build the query using Sequelize's way) like this:
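models.sequelize.query(
  `SELECT
    *
  FROM
    "Foos"
  WHERE
    unaccent("name") ILIKE unaccent(:keywords)
    OR "searchMatches" @> ARRAY[unaccent(:keywords)]::VARCHAR(255)[]
  ORDER BY
    "name" ASC`,
  // keywords is passed as a replacement instead of being interpolated, to avoid SQL injection
  { model: models.Foo, replacements: { keywords } })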
And it works!
A dictionary might be what you are looking for. Can basically be used to map synonyms and exclude common elements from indexes (e.g. "a" and "the" from English text), amongst other things.
https://www.postgresql.org/docs/current/static/textsearch-dictionaries.html
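In my case, I solved this using Sequelize.literal and COLLATE, like this:
where: Sequelize.literal(`name COLLATE Latin1_general_CI_AI like '%${keywords}%' COLLATE Latin1_general_CI_AI`)
This makes the comparison accent-insensitive on both sides. Note that Latin1_General_CI_AI is a SQL Server collation name, so this form applies to that dialect rather than to PostgreSQL.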
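To make the matching rule from the question concrete, here is a plain JavaScript sketch of an accent- and case-insensitive "contains" check. It is not how PostgreSQL's unaccent is implemented, and the helper names are made up for illustration. It only mimics the behaviour in memory, which can be handy for checking expectations in tests.

```javascript
// Remove diacritics by decomposing characters (NFD) and dropping combining marks.
function stripAccents(text) {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Case- and accent-insensitive "contains" check, similar in spirit to
// comparing unaccent(name) with unaccent(keywords) using ILIKE and wildcards.
function matchesIgnoringAccents(name, keywords) {
  return stripAccents(name).toLowerCase().includes(stripAccents(keywords).toLowerCase());
}

const names = ['The potato', 'Da potátos', 'We are the pótatóes', 'We eat pátatos'];
console.log(names.filter((n) => matchesIgnoringAccents(n, 'potato')));
// [ 'The potato', 'Da potátos', 'We are the pótatóes' ]
```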

Mongoose: How to populate 2 level deep population without populating fields of first level? in mongodb

Here is my Mongoose Schema:
var SchemaA = new Schema({
  field1: String,
  .......
  fieldB : { type: Schema.Types.ObjectId, ref: 'SchemaB' }
});
var SchemaB = new Schema({
  field1: String,
  .......
  fieldC : { type: Schema.Types.ObjectId, ref: 'SchemaC' }
});
var SchemaC = new Schema({
  field1: String,
  .......
  .......
  .......
});
When I access SchemaA using a find query, I want to get the fields/properties of SchemaA along with those of SchemaB and SchemaC, in the same way as a join operation in an SQL database.
This is my approach:
SchemaA.find({})
  .populate('fieldB')
  .exec(function (err, result){
    SchemaB.populate(result.fieldC,{path:'fieldB'},function(err, result){
      .............................
    });
  });
The above code works perfectly, but the problem is:
I want to have the information/properties/fields of SchemaC through SchemaA, and I don't want to populate the fields/properties of SchemaB.
The reason for not wanting the properties of SchemaB is that the extra population slows the query down unnecessarily.
Long story short:
I want to populate SchemaC through SchemaA without populating SchemaB.
Can you please suggest any way/approach?
As an avid mongodb fan, I suggest you use a relational database for highly relational data - that's what it's built for. You are losing all the benefits of mongodb when you have to perform 3+ queries to get a single object.
Buuuuuut, I know that comment will fall on deaf ears. Your best bet is to be as conscious as you can about performance. Your first step is to limit the fields to the minimum required. This is just good practice even with basic queries and any database engine - only get the fields you need (eg. SELECT * FROM === bad... just stop doing it!). You can also try doing lean queries to help save a lot of post-processing work mongoose does with the data. I didn't test this, but it should work...
SchemaA.find({}, 'field1 fieldB', { lean: true })
  .populate({
    path: 'fieldB', // populate() identifies the field with "path", not "name"
    select: 'fieldC',
    options: { lean: true }
  }).exec(function (err, result) {
    // not sure how you are populating "result" in your example, as it should be an array,
    // but you said your code works... so I'll let you figure out what goes here.
  });
Also, a very "mongo" way of doing what you want is to save a reference in SchemaC back to SchemaA. When I say "mongo" way of doing it, you have to break away from your years of thinking about relational data queries. Do whatever it takes to perform fewer queries on the database, even if it requires two-way references and/or data duplication.
For example, if I had a Book schema and an Author schema, I would likely save the author's first and last name in the Books collection, along with an _id reference to the full profile in the Authors collection. That way I can load my Books in a single query, still display the author's name, and then generate a hyperlink to the author's profile: /author/{_id}. This is known as "data denormalization", and it has been known to give people heartburn. I try to use it on data that doesn't change very often, like people's names. On the occasion that a name does change, it's trivial to write a function to update all the names in multiple places.
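To give an idea of what that update function involves, here is a rough in-memory sketch of the Book/Author example. The arrays stand in for the two collections, and the field names (firstName, lastName, author) and the function renameAuthor are assumptions made for illustration. Against a real database you would issue the equivalent updates instead, for example with Mongoose's updateMany.

```javascript
// In-memory stand-ins for the Authors and Books collections.
const authors = [{ _id: 'a1', firstName: 'Ann', lastName: 'Lee' }];
const books = [
  { title: 'First', author: { _id: 'a1', firstName: 'Ann', lastName: 'Lee' } },
  { title: 'Second', author: { _id: 'a1', firstName: 'Ann', lastName: 'Lee' } },
];

// Rename an author and propagate the change to every denormalized copy.
// Returns the number of books that were updated.
function renameAuthor(authorId, firstName, lastName) {
  const author = authors.find((a) => a._id === authorId);
  if (!author) return 0;
  Object.assign(author, { firstName, lastName });
  let updated = 0;
  for (const book of books) {
    if (book.author._id === authorId) {
      Object.assign(book.author, { firstName, lastName });
      updated += 1;
    }
  }
  return updated;
}

console.log(renameAuthor('a1', 'Ann', 'Smith')); // 2
console.log(books[0].author.lastName); // Smith
```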
SchemaA.find({})
  .populate({
    path: "fieldB",
    populate: { path: "fieldC" }
  }).exec(function (err, result) {
    // this is how you can get all key-value pairs of SchemaA, SchemaB and SchemaC
    // example: result.fieldB.fieldC._id (a key of SchemaC)
  });
Why not add a ref to SchemaC on SchemaA? The way you currently have it, there is no way to bridge from SchemaA to SchemaC without SchemaB, unless you populate SchemaB with no data other than a ref to SchemaC.
As explained in the docs under Field Selection, you can restrict what fields are returned.
.populate('fieldB') becomes populate('fieldB', 'fieldC -_id'). The -_id is required to omit the _id field just like when using select().
I think this is not possible, because when a document in A refers to a document in B, and that document refers to another document in C, how can the document in A know which document in C to refer to without any help from B?

sorting alpha with mongoose

I'm trying to sort via mongoose 3.6.20 and I am receiving some unexpected results.
I have a list of companies, each with a name. At first I thought that maybe it was sorting in a case-sensitive way, which, based on articles I've read, I expect is true.
I'm now using a virtual property to lowercase the sort field. However, I'm still getting unexpected results.
CompanySchema.virtual('name_lower').get(function(){
  return this.name.toLowerCase();
});
and when I sort
Company.find().sort({ name_lower: 1 });
I'm getting it in the following order:
company name
google
company name (yes a duplicate for testing)
I'm also outputting the value of my virtual property and it looks right. There is no whitespace or funky characters that would cause the 2nd 'company name' to appear after google.
Using nodejs, express, mongoose.
What am I missing or doing incorrectly?
Update:
Based on the information provided in the answers, I refactored my schema to include some normalized fields and hooked into the pre save event of my document, where I update those normalized fields and sort using them in all future queries.
CompanySchema.pre('save', function(next){
this.normalized_name = this.name;
});
Next, in the schema I use:
var CompanySchema = mongoose.Schema({
  ...
  normalized_name: { type: String, set: normalize },
  ...
});
Where normalize is a function that for now, returns a lowercase version of the value passed into it. However, this allows me to expand on it later really fast, and I can quickly do the same to other fields that I might need to sort against.
As of MongoDB v3.4, case-insensitive sorting can be done using the collation method:
Company.find()
  .collation({locale: "en" }) // or whatever collation you want
  .sort({name:'asc'})
  .exec(function(err, results) {
    // use your case-insensitive sorted results
  });
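To see why a collation changes the order, here is a small in-memory illustration using JavaScript's Intl.Collator. It is only an analogy for what the database does, not MongoDB's implementation, and the company names are made up. It also hints at why the virtual in the question had no effect: the sort runs inside MongoDB, which never sees Mongoose virtuals.

```javascript
const companies = ['google', 'Zebra Ltd', 'company name', 'Apple'];

// Default sort compares UTF-16 code units, so uppercase letters come before lowercase ones.
const byCodeUnit = [...companies].sort();

// A collator with sensitivity "base" ignores case and accents when comparing.
const collator = new Intl.Collator('en', { sensitivity: 'base' });
const byCollation = [...companies].sort(collator.compare);

console.log(byCodeUnit);  // [ 'Apple', 'Zebra Ltd', 'company name', 'google' ]
console.log(byCollation); // [ 'Apple', 'company name', 'google', 'Zebra Ltd' ]
```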
Unfortunately, MongoDB and Mongoose do not currently support complex sorting, so there are 2 options:
1) As you said, create a new field with the names sanitized to be all lowercase.
2) Run a big for loop over all the data and update each company name to its lowercase form:
db.CompanyCollection.find().forEach(
  function(e) {
    e.CompanyName = e.CompanyName.toLowerCase();
    db.CompanyCollection.save(e);
  }
)
or, as the body of that same callback:
db.CompanyCollection.update({_id: e._id}, {$set: {CompanyName: e.CompanyName.toLowerCase()}});
Please see Update MongoDB collection using $toLower and Mongoose: Sort alphabetically as well for more info.
I want to point out that in this hook:
CompanySchema.pre('save', function(next){
this.normalized_name = this.name;
});
You'll have to call next(); at the end, if you want the normalized_name to be saved in the database, so the pre save hook would look like:
CompanySchema.pre('save', function(next){
this.normalized_name = this.name;
next();
});
This answer seems to be more helpful to me. I had to consider diacritics along with the case so I had used strength:3.
Mongoose: Sort alphabetically

Resources